Compare commits

...

48 Commits

Author SHA1 Message Date
Bilal Catic
5bc0d4f8c2 Merge branch 'implement-renting-option-frontend' into 'master'
Implement renting option frontend

See merge request saburly/marketalarm/web!65
2019-10-31 18:12:42 +00:00
Bilal Catic
026d7cded7 change price slider options for renting option 2019-10-31 19:06:44 +01:00
Bilal Catic
9612b28c91 disable real estate type selection after first click 2019-10-31 19:06:44 +01:00
Bilal Catic
aab32fc608 disable button on first click 2019-10-31 19:06:44 +01:00
Bilal Catic
d32b98bb7b implement Rent option on the frontend 2019-10-31 19:06:44 +01:00
Bilal Catic
5817964b50 remove disabled css 2019-10-31 19:06:44 +01:00
Bilal Catic
59565885cb extend AD_TYPE in db model 2019-10-31 19:06:44 +01:00
Bilal Catic
127691f7bb extend AD_TYPE enum 2019-10-31 19:06:44 +01:00
Bilal Catic
4318fa8a2d extend AD_TYPE enum in specific crawler files 2019-10-31 19:06:44 +01:00
Bilal Catic
6261408a59 Merge branch 'implement-renting-option' into 'master'
Implement renting option - crawler part

See merge request saburly/marketalarm/web!64
2019-10-31 18:05:54 +00:00
Bilal Catic
ecc5b174a0 implement RENT option for Aktido; implement force crawl option 2019-10-30 17:23:43 +01:00
Bilal Catic
fa712ce97d implement RENT option for Rental; implement force crawl option 2019-10-30 15:53:11 +01:00
Bilal Catic
3abbed183e implement RENT and REQUEST option for OLX; implement force crawl option 2019-10-30 15:03:59 +01:00
Bilal Catic
97d93a3f37 add force crawl ENV option for OLX 2019-10-30 15:02:54 +01:00
Bilal Catic
3bb67a4db9 add REQUEST category 2019-10-30 15:02:31 +01:00
Bilal Catic
f181450aa9 fix slider box input - handle one input greater/smaller than the other 2019-10-30 11:23:07 +01:00
Bilal Catic
caec7b6554 Merge branch 'add-textbox-input-for-sliders' into 'master'
add input box for sliders

See merge request saburly/marketalarm/web!63
2019-10-30 09:56:58 +00:00
Bilal Catic
9033114545 add input box for sliders 2019-10-30 10:54:05 +01:00
Bilal Catic
cbbed137e6 Merge branch 'fix-invalid-email-crash' into 'master'
Fix invalid email crash

See merge request saburly/marketalarm/web!62
2019-10-30 09:01:38 +00:00
Bilal Catic
dd8e4d77ed improve email regex; improve error handling for query review 2019-10-28 12:34:14 +01:00
Bilal Catic
43877820cf validate real estate type selection 2019-10-28 10:59:08 +01:00
Bilal Catic
c6aeef10e8 Merge branch 'add-aktido-crawler' into 'master'
Add aktido crawler

See merge request saburly/marketalarm/web!60
2019-10-28 09:47:31 +00:00
Bilal Catic
1e36cb8423 add ALL category option for Rental agency 2019-10-28 09:24:08 +01:00
Bilal Catic
2c2fcd648f remove scrapeAd logging 2019-10-28 09:23:51 +01:00
Bilal Catic
5b6886f52b add ALL categories option for Aktido agency 2019-10-28 09:20:03 +01:00
Bilal Catic
f899c96dc6 add crawler and crawler config for Aktido agency 2019-10-28 09:14:45 +01:00
Bilal Catic
f5d912f02c Merge branch 'add-crawler-for-prostor-page' into 'master'
Add crawler for prostor page

See merge request saburly/marketalarm/web!59
2019-10-25 10:21:24 +00:00
Bilal Catic
747ebb88e5 add debugging log switch for crawler process 2019-10-25 11:08:52 +02:00
Bilal Catic
7e3b0bfcd5 implement crawler for Prostor agency 2019-10-25 10:54:08 +02:00
Bilal Catic
05fad652c4 add PROSTOR agency enum; update ENV template 2019-10-25 10:53:44 +02:00
Bilal Catic
5098b08b3f add ALL option to crawler cat, exclude from real estate types list 2019-10-24 17:43:14 +02:00
Bilal Catic
6fc4218e39 add config files for Prostor agency 2019-10-24 17:11:12 +02:00
Bilal Catic
935ae60ae1 move specific crawler config to the separated files 2019-10-24 16:57:23 +02:00
Bilal Catic
e82a0cfba4 Merge branch 'add-crawler-for-rental-page' into 'master'
Add crawler for rental page

See merge request saburly/marketalarm/web!58
2019-10-24 13:20:40 +00:00
Bilal Catic
2064d40985 stop "rental" crawler if there are no new real estates on the page 2019-10-24 11:26:11 +02:00
Bilal Catic
a6336b7d27 implement crawler for "rental" agency 2019-10-24 11:26:11 +02:00
Bilal Catic
ec798fe94c add crawler config and include specific crawler for "rental" agency 2019-10-24 11:26:11 +02:00
Bilal Catic
abc591749e Merge branch 'update-success-page' into 'master'
Update success page

See merge request saburly/marketalarm/web!57
2019-10-24 05:38:10 +00:00
Bilal Catic
d344d939bb move android GIF to the center 2019-10-24 07:37:18 +02:00
Bilal Catic
d4aec2f643 show both GIF instructions on desktop, but only android gif for mobile 2019-10-24 07:32:40 +02:00
Bilal Catic
7b02f3225b add GIF instructions on success page 2019-10-24 07:01:45 +02:00
Bilal Catic
617cf43bca Merge branch 'improve-social-media-sharing-preview' into 'master'
Improve social media sharing preview

See merge request saburly/marketalarm/web!56
2019-10-21 08:50:44 +00:00
Bilal Catic
8a217cc377 add meta tags for better social media sharing link preview 2019-10-21 10:50:27 +02:00
Bilal Catic
b1ec1a030f Merge branch 'add-renting-soon-option' into 'master'
add segmented control for ad type selection

See merge request saburly/marketalarm/web!55
2019-10-21 08:49:55 +00:00
Bilal Catic
3830c5f257 add "uskoro" text to the renting option 2019-10-21 10:08:12 +02:00
Bilal Catic
d10540c631 add segmented control for ad type selection 2019-10-21 08:05:13 +02:00
Bilal Catic
9dcb27291b Merge branch 'move-locate-me-button' into 'master'
Move locate me button

See merge request saburly/marketalarm/web!54
2019-10-18 12:35:52 +00:00
Bilal Catic
6a2cb18cf1 Merge branch 'move-locate-me-button' into 'master'
Fixed a couple of things like (locate me, location edit)

See merge request saburly/marketalarm/web!53
2019-10-18 12:09:50 +00:00
29 changed files with 1727 additions and 177 deletions

View File

@@ -1,12 +1,21 @@
const PRICE_SLIDER_OPTIONS = { const PRICE_SLIDER_OPTIONS_SALE = {
start: [50000, 85000], start: [50000, 85000],
range: { range: {
min: [0], min: [0],
max: [300000] max: [300000]
}, },
step: 1000, step: 1000,
connect: true, connect: true
tooltips: true };
const PRICE_SLIDER_OPTIONS_RENT = {
start: [300, 500],
range: {
min: [0],
max: [2000]
},
step: 50,
connect: true
}; };
//This will be used for Flats, Apartments, Houses //This will be used for Flats, Apartments, Houses
@@ -17,8 +26,7 @@ const HOME_SIZE_SLIDER_OPTIONS = {
max: [400] max: [400]
}, },
step: 5, step: 5,
connect: true, connect: true
tooltips: true
}; };
const GARDEN_SIZE_SLIDER_OPTIONS = { const GARDEN_SIZE_SLIDER_OPTIONS = {
@@ -28,8 +36,7 @@ const GARDEN_SIZE_SLIDER_OPTIONS = {
max: [10000] max: [10000]
}, },
step: 100, step: 100,
connect: true, connect: true
tooltips: true
}; };
const LAND_SIZE_SLIDER_OPTIONS = { const LAND_SIZE_SLIDER_OPTIONS = {
@@ -39,8 +46,7 @@ const LAND_SIZE_SLIDER_OPTIONS = {
max: [100000] max: [100000]
}, },
step: 100, step: 100,
connect: true, connect: true
tooltips: true
}; };
const GARAGE_SIZE_SLIDER_OPTIONS = { const GARAGE_SIZE_SLIDER_OPTIONS = {
start: [10, 20], start: [10, 20],
@@ -49,8 +55,7 @@ const GARAGE_SIZE_SLIDER_OPTIONS = {
max: [150] max: [150]
}, },
step: 2, step: 2,
connect: true, connect: true
tooltips: true
}; };
const GARAGE_PRICE_SLIDER_OPTIONS = { const GARAGE_PRICE_SLIDER_OPTIONS = {
@@ -60,28 +65,45 @@ const GARAGE_PRICE_SLIDER_OPTIONS = {
max: [100000] max: [100000]
}, },
step: 500, step: 500,
connect: true, connect: true
tooltips: true
}; };
const AD_TYPE = { const AD_TYPE = {
AD_TYPE_SALE: "SALE", AD_TYPE_SALE: {
AD_TYPE_RENT: "RENT" id: 1,
stringId: "SALE",
title: "Prodaja"
},
AD_TYPE_RENT: {
id: 2,
stringId: "RENT",
title: "Najam"
},
AD_TYPE_REQUEST: {
id: 3,
stringId: "REQUEST",
title: "Potražnja"
}
}; };
const AD_CATEGORY = { const AD_CATEGORY = {
ALL: {
id: "ALL"
},
FLAT: { FLAT: {
id: "FLAT", id: "FLAT",
title: "Stan", title: "Stan",
hasGardenSize: false, hasGardenSize: false,
priceSliderOptions: PRICE_SLIDER_OPTIONS, priceSliderOptionsSale: PRICE_SLIDER_OPTIONS_SALE,
priceSliderOptionsRent: PRICE_SLIDER_OPTIONS_RENT,
sizeSliderOptions: HOME_SIZE_SLIDER_OPTIONS sizeSliderOptions: HOME_SIZE_SLIDER_OPTIONS
}, },
HOUSE: { HOUSE: {
id: "HOUSE", id: "HOUSE",
title: "Kuća", title: "Kuća",
hasGardenSize: true, hasGardenSize: true,
priceSliderOptions: PRICE_SLIDER_OPTIONS, priceSliderOptionsSale: PRICE_SLIDER_OPTIONS_SALE,
priceSliderOptionsRent: PRICE_SLIDER_OPTIONS_RENT,
sizeSliderOptions: HOME_SIZE_SLIDER_OPTIONS, sizeSliderOptions: HOME_SIZE_SLIDER_OPTIONS,
gardenSizeSliderOptions: GARDEN_SIZE_SLIDER_OPTIONS gardenSizeSliderOptions: GARDEN_SIZE_SLIDER_OPTIONS
}, },
@@ -89,35 +111,40 @@ const AD_CATEGORY = {
id: "OFFICE", id: "OFFICE",
title: "Kancelarija", title: "Kancelarija",
hasGardenSize: false, hasGardenSize: false,
priceSliderOptions: PRICE_SLIDER_OPTIONS, priceSliderOptionsSale: PRICE_SLIDER_OPTIONS_SALE,
priceSliderOptionsRent: PRICE_SLIDER_OPTIONS_RENT,
sizeSliderOptions: HOME_SIZE_SLIDER_OPTIONS sizeSliderOptions: HOME_SIZE_SLIDER_OPTIONS
}, },
LAND: { LAND: {
id: "LAND", id: "LAND",
title: "Zemljište", title: "Zemljište",
hasGardenSize: false, hasGardenSize: false,
priceSliderOptions: PRICE_SLIDER_OPTIONS, priceSliderOptionsSale: PRICE_SLIDER_OPTIONS_SALE,
priceSliderOptionsRent: PRICE_SLIDER_OPTIONS_RENT,
sizeSliderOptions: LAND_SIZE_SLIDER_OPTIONS sizeSliderOptions: LAND_SIZE_SLIDER_OPTIONS
}, },
APARTMENT: { APARTMENT: {
id: "APARTMENT", id: "APARTMENT",
title: "Apartman", title: "Apartman",
hasGardenSize: false, hasGardenSize: false,
priceSliderOptions: PRICE_SLIDER_OPTIONS, priceSliderOptionsSale: PRICE_SLIDER_OPTIONS_SALE,
priceSliderOptionsRent: PRICE_SLIDER_OPTIONS_RENT,
sizeSliderOptions: HOME_SIZE_SLIDER_OPTIONS sizeSliderOptions: HOME_SIZE_SLIDER_OPTIONS
}, },
GARAGE: { GARAGE: {
id: "GARAGE", id: "GARAGE",
title: "Garaža", title: "Garaža",
hasGardenSize: false, hasGardenSize: false,
priceSliderOptions: GARAGE_PRICE_SLIDER_OPTIONS, priceSliderOptionsSale: PRICE_SLIDER_OPTIONS_SALE,
priceSliderOptionsRent: PRICE_SLIDER_OPTIONS_RENT,
sizeSliderOptions: GARAGE_SIZE_SLIDER_OPTIONS sizeSliderOptions: GARAGE_SIZE_SLIDER_OPTIONS
}, },
COTTAGE: { COTTAGE: {
id: "COTTAGE", id: "COTTAGE",
title: "Vikendica", title: "Vikendica",
hasGardenSize: true, hasGardenSize: true,
priceSliderOptions: PRICE_SLIDER_OPTIONS, priceSliderOptionsSale: PRICE_SLIDER_OPTIONS_SALE,
priceSliderOptionsRent: PRICE_SLIDER_OPTIONS_RENT,
sizeSliderOptions: HOME_SIZE_SLIDER_OPTIONS, sizeSliderOptions: HOME_SIZE_SLIDER_OPTIONS,
gardenSizeSliderOptions: GARDEN_SIZE_SLIDER_OPTIONS gardenSizeSliderOptions: GARDEN_SIZE_SLIDER_OPTIONS
} }
@@ -133,14 +160,18 @@ const AD_STATUS = {
}; };
const AD_AGENCY = { const AD_AGENCY = {
OLX: "OLX" OLX: "OLX",
RENTAL: "RENTAL",
PROSTOR: "PROSTOR",
AKTIDO: "AKTIDO"
}; };
const CRAWLER_AD_TYPE = { const CRAWLER_AD_TYPE = {
NONE: 0, NONE: 0,
ALL: 1, ALL: 1,
ONLY_SELL: 2, ONLY_SELL: 2,
ONLY_RENT: 3 ONLY_RENT: 3,
ONLY_REQUEST: 4
}; };
module.exports = { module.exports = {

View File

@@ -28,6 +28,8 @@ const MAX_REAL_ESTATES_IN_EMAIL =
const MAX_REAL_ESTATES_IN_FIRST_EMAIL = const MAX_REAL_ESTATES_IN_FIRST_EMAIL =
parseInt(process.env.MAX_REAL_ESTATES_IN_FIRST_EMAIL) || 5; parseInt(process.env.MAX_REAL_ESTATES_IN_FIRST_EMAIL) || 5;
const PRINT_CRAWLER_DEBUG = process.env.PRINT_CRAWLER_DEBUG_INFO || 0;
module.exports = { module.exports = {
APP_PORT, APP_PORT,
APP_URL, APP_URL,
@@ -36,5 +38,6 @@ module.exports = {
STOP_CRAWLER, STOP_CRAWLER,
AWS_EMAIL_CONFIG, AWS_EMAIL_CONFIG,
MAX_REAL_ESTATES_IN_EMAIL, MAX_REAL_ESTATES_IN_EMAIL,
MAX_REAL_ESTATES_IN_FIRST_EMAIL MAX_REAL_ESTATES_IN_FIRST_EMAIL,
PRINT_CRAWLER_DEBUG
}; };
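
Environment variables always arrive as strings, so a switch such as the debug flag behaves differently depending on whether the value is parsed. The following is a minimal sketch of that difference, assuming a hypothetical `parseFlag` helper (not part of this change set) that uses the same `!!parseInt(...)` idiom as the `*_FORCE_CRAWL` options.

```javascript
"use strict";

// Hypothetical helper (an assumption, not in the original diff): turns an
// environment value such as "1" or "0" into a real boolean.
function parseFlag(rawValue) {
  // parseInt(undefined, 10) is NaN and !!NaN is false, so an unset
  // variable simply leaves the flag disabled.
  return !!parseInt(rawValue, 10);
}

// Without parsing, the value stays a string, and every non-empty
// string (including "0") is truthy.
const unparsed = "0" || 0;

console.log(Boolean(unparsed)); // true
console.log(parseFlag("0")); // false
console.log(parseFlag("1")); // true
console.log(parseFlag(undefined)); // false
```

With a helper like this, setting the variable to `0` in the `.env` file would switch debug output off rather than on.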

View File

@@ -3,11 +3,12 @@ const { isValidEmail } = require("../helpers/email");
const { const {
notifyForNewSearchRequest notifyForNewSearchRequest
} = require("../services/notificationService"); } = require("../services/notificationService");
const { AD_CATEGORY } = require("../common/enums"); const { AD_CATEGORY, AD_TYPE } = require("../common/enums");
const getQueryReviewData = searchRequest => { const getQueryReviewData = searchRequest => {
const { const {
id, id,
adType,
realEstateType, realEstateType,
sizeMin, sizeMin,
sizeMax, sizeMax,
@@ -22,8 +23,21 @@ const getQueryReviewData = searchRequest => {
? realEstateTypeObject.hasGardenSize ? realEstateTypeObject.hasGardenSize
: false; : false;
let adTypeTitle = "";
switch (adType) {
case AD_TYPE.AD_TYPE_SALE.stringId:
adTypeTitle = AD_TYPE.AD_TYPE_SALE.title;
break;
case AD_TYPE.AD_TYPE_RENT.stringId:
adTypeTitle = AD_TYPE.AD_TYPE_RENT.title;
break;
default:
adTypeTitle = "-";
break;
}
const realEstateTypeTitle = realEstateTypeObject const realEstateTypeTitle = realEstateTypeObject
? realEstateTypeObject.title ? `[${adTypeTitle}] ${realEstateTypeObject.title}`
: "-"; : "-";
const locationTitle = "Promjenite lokaciju"; const locationTitle = "Promjenite lokaciju";
@@ -122,9 +136,39 @@ const postQueryReview = async (req, res) => {
searchRequest.email = emailInput; searchRequest.email = emailInput;
searchRequest.subscribed = true; searchRequest.subscribed = true;
await searchRequest.save();
await notifyForNewSearchRequest(searchRequest); try {
await searchRequest.save();
} catch (e) {
console.log("[ERROR] Failed to save search request !", e);
console.log("Search request : ", searchRequest);
const error =
"Greška ! Nismo uspjeli kreirati zahtjev za Vašu pretragu. Molimo pokušajte ponovo";
res.render("queryReview", {
error,
title,
queryReviewData,
email: ""
});
return;
}
try {
await notifyForNewSearchRequest(searchRequest);
} catch (e) {
console.log("[ERROR] Failed to send initial welcome email", e);
console.log("Search request : ", searchRequest);
const error =
"Greška ! Nismo uspjeli poslati email na Vašu adresu, pokušajte sa drugom email adresom";
res.render("queryReview", {
error,
title,
queryReviewData,
email: ""
});
return;
}
res.redirect(nextStep); res.redirect(nextStep);
}; };
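
The ad type title lookup in `getQueryReviewData` could also be written as a table lookup instead of a `switch`. This is only a sketch: `adTypeTitleFor` and its `allowedKeys` parameter are hypothetical names, and the `AD_TYPE` object is a local copy of the enum shown in the diff so the snippet runs on its own.

```javascript
"use strict";

// Local stand-in for the AD_TYPE enum from ../common/enums.
const AD_TYPE = {
  AD_TYPE_SALE: { id: 1, stringId: "SALE", title: "Prodaja" },
  AD_TYPE_RENT: { id: 2, stringId: "RENT", title: "Najam" },
  AD_TYPE_REQUEST: { id: 3, stringId: "REQUEST", title: "Potražnja" }
};

// Hypothetical helper: resolve the display title for a stored adType string.
// Only the listed keys are considered, so REQUEST still falls back to "-",
// matching the behaviour of the switch statement.
function adTypeTitleFor(
  adType,
  allowedKeys = ["AD_TYPE_SALE", "AD_TYPE_RENT"]
) {
  const match = allowedKeys
    .map(key => AD_TYPE[key])
    .find(type => type && type.stringId === adType);
  return match ? match.title : "-";
}

console.log(adTypeTitleFor("SALE")); // Prodaja
console.log(adTypeTitleFor("RENT")); // Najam
console.log(adTypeTitleFor("REQUEST")); // -
console.log(adTypeTitleFor(undefined)); // -
```

Supporting a further ad type on the review page would then mean adding its key to `allowedKeys` rather than adding another `case`.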

View File

@@ -1,5 +1,5 @@
const { currentSearchRequest } = require("../helpers/url"); const { currentSearchRequest } = require("../helpers/url");
const { AD_CATEGORY } = require("../common/enums"); const { AD_CATEGORY, AD_TYPE } = require("../common/enums");
const getFilters = async (req, res) => { const getFilters = async (req, res) => {
const searchRequest = await currentSearchRequest(req); const searchRequest = await currentSearchRequest(req);
@@ -12,6 +12,7 @@ const getFilters = async (req, res) => {
const title = "Filteri za pretraživanje"; const title = "Filteri za pretraživanje";
const { const {
adType,
realEstateType, realEstateType,
priceMin, priceMin,
priceMax, priceMax,
@@ -24,11 +25,22 @@ const getFilters = async (req, res) => {
const { const {
hasGardenSize, hasGardenSize,
priceSliderOptions, priceSliderOptionsSale,
priceSliderOptionsRent,
sizeSliderOptions, sizeSliderOptions,
gardenSizeSliderOptions gardenSizeSliderOptions
} = category; } = category;
let priceSliderOptions;
if (adType === AD_TYPE.AD_TYPE_SALE.stringId) {
priceSliderOptions = Object.assign({}, priceSliderOptionsSale);
} else if (adType === AD_TYPE.AD_TYPE_RENT.stringId) {
priceSliderOptions = Object.assign({}, priceSliderOptionsRent);
} else {
res.render("notFound", { title: " " });
return;
}
if (priceMin || priceMax) { if (priceMin || priceMax) {
priceSliderOptions.start = [priceMin, priceMax]; priceSliderOptions.start = [priceMin, priceMax];
} }
@@ -61,10 +73,10 @@ const postFilters = async (req, res) => {
const nextStepPage = req.query.nextStep || "pregled"; const nextStepPage = req.query.nextStep || "pregled";
const nextStepUrl = `/${nextStepPage}/${searchRequest.id}`; const nextStepUrl = `/${nextStepPage}/${searchRequest.id}`;
const priceMin = parseInt(req.body.priceFilterMin) || 0; const priceMin = parseInt(req.body.priceMin) || 0;
const priceMax = parseInt(req.body.priceFilterMax) || 0; const priceMax = parseInt(req.body.priceMax) || 0;
const sizeMin = parseInt(req.body.sizeFilterMin) || 0; const sizeMin = parseInt(req.body.sizeMin) || 0;
const sizeMax = parseInt(req.body.sizeFilterMax) || 0; const sizeMax = parseInt(req.body.sizeMax) || 0;
//TODO: Filter validation //TODO: Filter validation
@@ -74,11 +86,11 @@ const postFilters = async (req, res) => {
searchRequest.sizeMax = sizeMax; searchRequest.sizeMax = sizeMax;
if ( if (
req.body.gardenSizeFilterMin !== undefined && req.body.gardenSizeMin !== undefined &&
req.body.gardenSizeFilterMax !== undefined req.body.gardenSizeMax !== undefined
) { ) {
const gardenSizeMin = parseInt(req.body.gardenSizeFilterMin); const gardenSizeMin = parseInt(req.body.gardenSizeMin);
const gardenSizeMax = parseInt(req.body.gardenSizeFilterMax); const gardenSizeMax = parseInt(req.body.gardenSizeMax);
//TODO: Filter validation //TODO: Filter validation
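
The slider selection in `getFilters` copies the shared options object before overwriting `start`, which keeps the module-level constants untouched between requests. A minimal sketch of that pattern follows; `pickPriceSliderOptions` is a hypothetical helper name and the option objects are local copies of the values in the diff.

```javascript
"use strict";

// Local copies of the slider constants from the enums file.
const PRICE_SLIDER_OPTIONS_SALE = {
  start: [50000, 85000],
  range: { min: [0], max: [300000] },
  step: 1000,
  connect: true
};
const PRICE_SLIDER_OPTIONS_RENT = {
  start: [300, 500],
  range: { min: [0], max: [2000] },
  step: 50,
  connect: true
};

const flat = {
  priceSliderOptionsSale: PRICE_SLIDER_OPTIONS_SALE,
  priceSliderOptionsRent: PRICE_SLIDER_OPTIONS_RENT
};

// Hypothetical helper: returns a per-request copy of the slider options,
// or null when the ad type has no price slider defined.
function pickPriceSliderOptions(category, adType, priceMin, priceMax) {
  const byAdType = new Map([
    ["SALE", category.priceSliderOptionsSale],
    ["RENT", category.priceSliderOptionsRent]
  ]);
  const base = byAdType.get(adType);
  if (!base) {
    return null;
  }
  // Shallow copy: replacing `start` on the copy leaves the shared
  // constant unchanged. Nested objects such as `range` are still shared.
  const options = Object.assign({}, base);
  if (priceMin || priceMax) {
    options.start = [priceMin, priceMax];
  }
  return options;
}

const rentOptions = pickPriceSliderOptions(flat, "RENT", 400, 900);
console.log(rentOptions.start); // [ 400, 900 ]
console.log(PRICE_SLIDER_OPTIONS_RENT.start); // [ 300, 500 ]
console.log(pickPriceSliderOptions(flat, "REQUEST")); // null
```

A `null` result corresponds to the branch that renders the `notFound` page.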

View File

@@ -1,33 +1,68 @@
const { currentSearchRequest } = require("../helpers/url"); const { currentSearchRequest } = require("../helpers/url");
const { createSearchRequest } = require("../helpers/db/searchRequest"); const { createSearchRequest } = require("../helpers/db/searchRequest");
const { AD_CATEGORY } = require("../common/enums"); const { AD_CATEGORY, AD_TYPE } = require("../common/enums");
const getRealEstateTypes = async (req, res) => {
const searchRequest = await currentSearchRequest(req);
const getRealEstateTypes = (req, res) => {
const title = "Koju nekretninu tražite?"; const title = "Koju nekretninu tražite?";
const realEstateTypes = Object.keys(AD_CATEGORY).map( let selectedAdType = AD_TYPE.AD_TYPE_SALE.id;
category => AD_CATEGORY[category] if (
); searchRequest &&
res.render("realEstateType", { realEstateTypes, title }); searchRequest.adType &&
searchRequest.adType === AD_TYPE.AD_TYPE_RENT.stringId
) {
selectedAdType = AD_TYPE.AD_TYPE_RENT.id;
}
const realEstateTypes = Object.keys(AD_CATEGORY)
.map(category => AD_CATEGORY[category])
.filter(category => category.title);
res.render("realEstateType", {
selectedAdType,
realEstateTypes,
title,
AD_TYPE
});
}; };
const postRealEstateTypes = async (req, res) => { const postRealEstateTypes = async (req, res) => {
const searchRequest = await currentSearchRequest(req); const searchRequest = await currentSearchRequest(req);
//TODO: check if selected real estate type is valid const adType = parseInt(req.body.adType);
const adTypeStringIds = {
[AD_TYPE.AD_TYPE_SALE.id]: AD_TYPE.AD_TYPE_SALE.stringId,
[AD_TYPE.AD_TYPE_RENT.id]: AD_TYPE.AD_TYPE_RENT.stringId
};
const adTypeStringId =
adTypeStringIds[adType] || AD_TYPE.AD_TYPE_SALE.stringId;
const validRealEstateTypes = Object.keys(AD_CATEGORY).filter(
category => !!AD_CATEGORY[category].title
);
const selectedRealEstateType = req.body.realEstateType || null; const selectedRealEstateType = req.body.realEstateType || null;
if (validRealEstateTypes.indexOf(selectedRealEstateType) === -1) {
res.render("notFound", { title: " " });
return;
}
const nextStepPage = req.query.nextStep || "lokacija"; const nextStepPage = req.query.nextStep || "lokacija";
let nextStepUrl = ""; let nextStepUrl = "";
if (searchRequest && searchRequest.id) { if (searchRequest && searchRequest.id) {
nextStepUrl = `/${nextStepPage}/${searchRequest.id}`; nextStepUrl = `/${nextStepPage}/${searchRequest.id}`;
searchRequest.adType = adTypeStringId;
searchRequest.realEstateType = selectedRealEstateType; searchRequest.realEstateType = selectedRealEstateType;
await searchRequest.save(); await searchRequest.save();
} else { } else {
try { try {
const newSearchRequest = await createSearchRequest({ const newSearchRequest = await createSearchRequest({
adType: adTypeStringId,
realEstateType: selectedRealEstateType realEstateType: selectedRealEstateType
}); });
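
The input handling in `postRealEstateTypes` does two things: it maps the numeric `adType` from the form to its string identifier (falling back to SALE), and it rejects any real estate type that has no `title`. A sketch of that logic as one pure function, assuming a hypothetical `parseSelection` helper and trimmed local copies of the enums:

```javascript
"use strict";

// Trimmed local stand-ins for the enums from ../common/enums.
const AD_TYPE = {
  AD_TYPE_SALE: { id: 1, stringId: "SALE", title: "Prodaja" },
  AD_TYPE_RENT: { id: 2, stringId: "RENT", title: "Najam" }
};
const AD_CATEGORY = {
  ALL: { id: "ALL" }, // crawler-only entry, has no title
  FLAT: { id: "FLAT", title: "Stan" },
  HOUSE: { id: "HOUSE", title: "Kuća" }
};

// Hypothetical helper: returns the validated selection, or null when the
// submitted real estate type is not one a user may choose.
function parseSelection(body) {
  const adTypeId = parseInt(body.adType, 10);
  const selectable = [AD_TYPE.AD_TYPE_SALE, AD_TYPE.AD_TYPE_RENT];
  // Unknown or missing ad types fall back to SALE, as in the controller.
  const adType =
    selectable.find(type => type.id === adTypeId) || AD_TYPE.AD_TYPE_SALE;

  const validRealEstateTypes = Object.keys(AD_CATEGORY).filter(
    key => !!AD_CATEGORY[key].title
  );
  const realEstateType = body.realEstateType || null;
  if (!validRealEstateTypes.includes(realEstateType)) {
    return null;
  }
  return { adType: adType.stringId, realEstateType };
}

console.log(parseSelection({ adType: "2", realEstateType: "FLAT" }));
// { adType: 'RENT', realEstateType: 'FLAT' }
console.log(parseSelection({ adType: "9", realEstateType: "HOUSE" }));
// { adType: 'SALE', realEstateType: 'HOUSE' }
console.log(parseSelection({ adType: "1", realEstateType: "ALL" })); // null
```

Keeping the validation free of `req` and `res` makes it straightforward to test without an HTTP server.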

View File

@@ -5,31 +5,75 @@
All environment specific configuration is read here and All environment specific configuration is read here and
passed to the crawlers and savers. passed to the crawlers and savers.
*/ */
const OlxCrawler = require("./specific/olx"); const OlxCrawler = require("./specificCrawlers/olx");
const { OLX_CONFIG } = require("./crawlerConfig"); const RentalCrawler = require("./specificCrawlers/rental");
const ProstorCrawler = require("./specificCrawlers/prostor");
const AktidoCrawler = require("./specificCrawlers/aktido");
const {
OLX_CONFIG,
RENTAL_CONFIG,
PROSTOR_CONFIG,
AKTIDO_CONFIG
} = require("./crawlerConfig");
const PostgresSaver = require("./savers/postgres"); const PostgresSaver = require("./savers/postgres");
const crawlers = [
new OlxCrawler(
[new PostgresSaver()],
OLX_CONFIG.OLX_CRAWLER_AD_TYPE,
OLX_CONFIG.OLX_CRAWLER_AD_CATEGORIES,
OLX_CONFIG.OLX_MAX_PAGES,
OLX_CONFIG.OLX_MAX_RESULTS_PER_PAGE,
OLX_CONFIG.OLX_IGNORED_USERNAMES,
OLX_CONFIG.OLX_DELAY_BETWEEN_PAGES
)
];
async function crawlAll() { async function crawlAll() {
for (let crawler of crawlers) { const postgresSaver = new PostgresSaver();
const crawlers = [
new OlxCrawler(
[postgresSaver],
OLX_CONFIG.OLX_CRAWLER_AD_TYPE,
OLX_CONFIG.OLX_CRAWLER_AD_CATEGORIES,
OLX_CONFIG.OLX_MAX_PAGES,
OLX_CONFIG.OLX_MAX_RESULTS_PER_PAGE,
OLX_CONFIG.OLX_IGNORED_USERNAMES,
OLX_CONFIG.OLX_DELAY_BETWEEN_PAGES
),
new RentalCrawler(
[postgresSaver],
RENTAL_CONFIG.RENTAL_CRAWLER_AD_TYPE,
RENTAL_CONFIG.RENTAL_CRAWLER_AD_CATEGORIES,
RENTAL_CONFIG.RENTAL_MAX_PAGES,
RENTAL_CONFIG.RENTAL_MAX_RESULTS_PER_PAGE,
RENTAL_CONFIG.RENTAL_IGNORED_USERNAMES,
RENTAL_CONFIG.RENTAL_DELAY_BETWEEN_PAGES
),
new ProstorCrawler(
[postgresSaver],
PROSTOR_CONFIG.PROSTOR_CRAWLER_AD_TYPE,
PROSTOR_CONFIG.PROSTOR_CRAWLER_AD_CATEGORIES,
PROSTOR_CONFIG.PROSTOR_MAX_PAGES,
PROSTOR_CONFIG.PROSTOR_MAX_RESULTS_PER_PAGE,
PROSTOR_CONFIG.PROSTOR_IGNORED_USERNAMES,
PROSTOR_CONFIG.PROSTOR_DELAY_BETWEEN_PAGES
),
new AktidoCrawler(
[postgresSaver],
AKTIDO_CONFIG.AKTIDO_CRAWLER_AD_TYPE,
AKTIDO_CONFIG.AKTIDO_CRAWLER_AD_CATEGORIES,
AKTIDO_CONFIG.AKTIDO_MAX_PAGES,
AKTIDO_CONFIG.AKTIDO_MAX_RESULTS_PER_PAGE,
AKTIDO_CONFIG.AKTIDO_IGNORED_USERNAMES,
AKTIDO_CONFIG.AKTIDO_DELAY_BETWEEN_PAGES
)
];
const newRealEstates = [];
for (const crawler of crawlers) {
try { try {
return await crawler.crawl(); const newRealEstatesFromSingleCrawler = await crawler.crawl();
if (Array.isArray(newRealEstatesFromSingleCrawler)) {
newRealEstates.push(...newRealEstatesFromSingleCrawler);
}
} catch (e) { } catch (e) {
console.log("Error crawling. Trying next crawler! ", e); console.log("Error crawling. Trying next crawler! ", e);
return [];
} }
} }
return newRealEstates;
} }
module.exports = { module.exports = {
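
The reworked `crawlAll` no longer returns from inside the loop, so one failing crawler does not hide the results of the others. The sketch below reproduces that control flow with stand-in crawler objects; passing the crawler list as a parameter is an assumption made here for testability and differs from the function in the diff, which builds the list itself.

```javascript
"use strict";

// Collect results from every crawler; a crawler that throws or returns
// a non-array value is skipped instead of aborting the whole run.
async function crawlAll(crawlers) {
  const newRealEstates = [];
  for (const crawler of crawlers) {
    try {
      const found = await crawler.crawl();
      if (Array.isArray(found)) {
        newRealEstates.push(...found);
      }
    } catch (e) {
      console.log("Error crawling. Trying next crawler! ", e.message);
    }
  }
  return newRealEstates;
}

// Hypothetical stand-ins for OlxCrawler, RentalCrawler and the others.
const fakeCrawlers = [
  { crawl: async () => [{ id: "a" }] },
  {
    crawl: async () => {
      throw new Error("site unreachable");
    }
  },
  { crawl: async () => undefined },
  { crawl: async () => [{ id: "b" }, { id: "c" }] }
];

crawlAll(fakeCrawlers).then(results => {
  console.log(results.map(realEstate => realEstate.id)); // [ 'a', 'b', 'c' ]
});
```

The crawlers still run one after another, which keeps the load on each target site predictable.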

View File

@@ -1,42 +1,14 @@
"use strict"; "use strict";
require("dotenv").config({ path: __dirname + "/./../../.env" }); require("dotenv").config({ path: __dirname + "/./../../.env" });
const { CRAWLER_AD_TYPE, AD_CATEGORY } = require("../common/enums");
const olxCrawlerAdType = const OLX_CONFIG = require("./specificConfigs/olx");
process.env.OLX_CRAWLER_AD_TYPE !== undefined const RENTAL_CONFIG = require("./specificConfigs/rental");
? CRAWLER_AD_TYPE[process.env.OLX_CRAWLER_AD_TYPE] const PROSTOR_CONFIG = require("./specificConfigs/prostor");
: null; const AKTIDO_CONFIG = require("./specificConfigs/aktido");
const olxParsedCrawlerAdCategories =
process.env.OLX_CRAWLER_AD_CATEGORIES !== undefined
? process.env.OLX_CRAWLER_AD_CATEGORIES.split(",").map(category =>
category.trim()
)
: ["FLAT", "HOUSE"];
const olxIgnoredUsernames =
process.env.OLX_IGNORED_USERNAMES !== undefined
? process.env.OLX_IGNORED_USERNAMES.split(",").map(username =>
username.trim()
)
: [];
const transformedCrawlerAdCategories = olxParsedCrawlerAdCategories
.map(categoryName =>
AD_CATEGORY[categoryName] ? AD_CATEGORY[categoryName].id : undefined
)
.filter(category => !!category);
const OLX_CONFIG = {
OLX_MAX_PAGES: parseInt(process.env.OLX_MAX_PAGES) || 500,
OLX_MAX_RESULTS_PER_PAGE:
parseInt(process.env.OLX_MAX_RESULTS_PER_PAGE) || 50,
OLX_CRAWLER_AD_TYPE: olxCrawlerAdType || CRAWLER_AD_TYPE.NONE,
OLX_CRAWLER_AD_CATEGORIES: transformedCrawlerAdCategories,
OLX_IGNORED_USERNAMES: olxIgnoredUsernames || [],
OLX_DELAY_BETWEEN_PAGES: parseInt(process.env.OLX_DELAY_BETWEEN_PAGES) || 1000
};
module.exports = { module.exports = {
OLX_CONFIG OLX_CONFIG,
RENTAL_CONFIG,
PROSTOR_CONFIG,
AKTIDO_CONFIG
}; };

View File

@@ -0,0 +1,34 @@
"use strict";
const { CRAWLER_AD_TYPE, AD_CATEGORY } = require("../../common/enums");
const aktidoCrawlerAdType =
process.env.AKTIDO_CRAWLER_AD_TYPE !== undefined
? CRAWLER_AD_TYPE[process.env.AKTIDO_CRAWLER_AD_TYPE]
: null;
const aktidoParsedCrawlerAdCategories =
process.env.AKTIDO_CRAWLER_AD_CATEGORIES !== undefined
? process.env.AKTIDO_CRAWLER_AD_CATEGORIES.split(",").map(category =>
category.trim()
)
: ["FLAT", "HOUSE"];
const aktidoIgnoredUsernames = [];
const transformedAktidoCrawlerAdCategories = aktidoParsedCrawlerAdCategories
.map(categoryName =>
AD_CATEGORY[categoryName] ? AD_CATEGORY[categoryName].id : undefined
)
.filter(category => !!category);
module.exports = {
AKTIDO_MAX_PAGES: parseInt(process.env.AKTIDO_MAX_PAGES) || 500,
AKTIDO_MAX_RESULTS_PER_PAGE:
parseInt(process.env.AKTIDO_MAX_RESULTS_PER_PAGE) || 50,
AKTIDO_CRAWLER_AD_TYPE: aktidoCrawlerAdType || CRAWLER_AD_TYPE.NONE,
AKTIDO_CRAWLER_AD_CATEGORIES: transformedAktidoCrawlerAdCategories,
AKTIDO_IGNORED_USERNAMES: aktidoIgnoredUsernames || [],
AKTIDO_DELAY_BETWEEN_PAGES:
parseInt(process.env.AKTIDO_DELAY_BETWEEN_PAGES) || 1000,
AKTIDO_FORCE_CRAWL: !!parseInt(process.env.AKTIDO_FORCE_CRAWL)
};

View File

@@ -0,0 +1,39 @@
"use strict";
const { CRAWLER_AD_TYPE, AD_CATEGORY } = require("../../common/enums");
const olxCrawlerAdType =
process.env.OLX_CRAWLER_AD_TYPE !== undefined
? CRAWLER_AD_TYPE[process.env.OLX_CRAWLER_AD_TYPE]
: null;
const olxParsedCrawlerAdCategories =
process.env.OLX_CRAWLER_AD_CATEGORIES !== undefined
? process.env.OLX_CRAWLER_AD_CATEGORIES.split(",").map(category =>
category.trim()
)
: ["FLAT", "HOUSE"];
const olxIgnoredUsernames =
process.env.OLX_IGNORED_USERNAMES !== undefined
? process.env.OLX_IGNORED_USERNAMES.split(",").map(username =>
username.trim()
)
: [];
const transformedOlxCrawlerAdCategories = olxParsedCrawlerAdCategories
.map(categoryName =>
AD_CATEGORY[categoryName] ? AD_CATEGORY[categoryName].id : undefined
)
.filter(category => !!category);
module.exports = {
OLX_MAX_PAGES: parseInt(process.env.OLX_MAX_PAGES) || 500,
OLX_MAX_RESULTS_PER_PAGE:
parseInt(process.env.OLX_MAX_RESULTS_PER_PAGE) || 50,
OLX_CRAWLER_AD_TYPE: olxCrawlerAdType || CRAWLER_AD_TYPE.NONE,
OLX_CRAWLER_AD_CATEGORIES: transformedOlxCrawlerAdCategories,
OLX_IGNORED_USERNAMES: olxIgnoredUsernames || [],
OLX_DELAY_BETWEEN_PAGES:
parseInt(process.env.OLX_DELAY_BETWEEN_PAGES) || 1000,
OLX_FORCE_CRAWL: !!parseInt(process.env.OLX_FORCE_CRAWL)
};

View File

@@ -0,0 +1,33 @@
"use strict";
const { CRAWLER_AD_TYPE, AD_CATEGORY } = require("../../common/enums");
const prostorCrawlerAdType =
process.env.PROSTOR_CRAWLER_AD_TYPE !== undefined
? CRAWLER_AD_TYPE[process.env.PROSTOR_CRAWLER_AD_TYPE]
: null;
const prostorParsedCrawlerAdCategories =
process.env.PROSTOR_CRAWLER_AD_CATEGORIES !== undefined
? process.env.PROSTOR_CRAWLER_AD_CATEGORIES.split(",").map(category =>
category.trim()
)
: ["FLAT", "HOUSE"];
const prostorIgnoredUsernames = [];
const transformedProstorCrawlerAdCategories = prostorParsedCrawlerAdCategories
.map(categoryName =>
AD_CATEGORY[categoryName] ? AD_CATEGORY[categoryName].id : undefined
)
.filter(category => !!category);
module.exports = {
PROSTOR_MAX_PAGES: parseInt(process.env.PROSTOR_MAX_PAGES) || 100,
PROSTOR_MAX_RESULTS_PER_PAGE:
parseInt(process.env.PROSTOR_MAX_RESULTS_PER_PAGE) || 5000,
PROSTOR_CRAWLER_AD_TYPE: prostorCrawlerAdType || CRAWLER_AD_TYPE.NONE,
PROSTOR_CRAWLER_AD_CATEGORIES: transformedProstorCrawlerAdCategories,
PROSTOR_IGNORED_USERNAMES: prostorIgnoredUsernames || [],
PROSTOR_DELAY_BETWEEN_PAGES:
parseInt(process.env.PROSTOR_DELAY_BETWEEN_PAGES) || 1000
};

View File

@@ -0,0 +1,34 @@
"use strict";
const { CRAWLER_AD_TYPE, AD_CATEGORY } = require("../../common/enums");
const rentalCrawlerAdType =
process.env.RENTAL_CRAWLER_AD_TYPE !== undefined
? CRAWLER_AD_TYPE[process.env.RENTAL_CRAWLER_AD_TYPE]
: null;
const rentalParsedCrawlerAdCategories =
process.env.RENTAL_CRAWLER_AD_CATEGORIES !== undefined
? process.env.RENTAL_CRAWLER_AD_CATEGORIES.split(",").map(category =>
category.trim()
)
: ["FLAT", "HOUSE"];
const rentalIgnoredUsernames = [];
const transformedRentalCrawlerAdCategories = rentalParsedCrawlerAdCategories
.map(categoryName =>
AD_CATEGORY[categoryName] ? AD_CATEGORY[categoryName].id : undefined
)
.filter(category => !!category);
module.exports = {
RENTAL_MAX_PAGES: parseInt(process.env.RENTAL_MAX_PAGES) || 500,
RENTAL_MAX_RESULTS_PER_PAGE:
parseInt(process.env.RENTAL_MAX_RESULTS_PER_PAGE) || 50,
RENTAL_CRAWLER_AD_TYPE: rentalCrawlerAdType || CRAWLER_AD_TYPE.NONE,
RENTAL_CRAWLER_AD_CATEGORIES: transformedRentalCrawlerAdCategories,
RENTAL_IGNORED_USERNAMES: rentalIgnoredUsernames || [],
RENTAL_DELAY_BETWEEN_PAGES:
parseInt(process.env.RENTAL_DELAY_BETWEEN_PAGES) || 1000,
RENTAL_FORCE_CRAWL: !!parseInt(process.env.RENTAL_FORCE_CRAWL)
};
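
The four files under `specificConfigs` differ only in the environment variable prefix and a few default values. One possible way to remove the duplication is a small factory, sketched below. `buildCrawlerConfig`, its `env` and `defaults` parameters, and the unprefixed result keys (`maxPages`, `adType`, and so on) are all assumptions for illustration; the crawler wiring in the diff expects prefixed keys such as `OLX_MAX_PAGES`.

```javascript
"use strict";

// Trimmed local stand-ins for the enums from ../../common/enums.
const CRAWLER_AD_TYPE = {
  NONE: 0,
  ALL: 1,
  ONLY_SELL: 2,
  ONLY_RENT: 3,
  ONLY_REQUEST: 4
};
const AD_CATEGORY = {
  ALL: { id: "ALL" },
  FLAT: { id: "FLAT" },
  HOUSE: { id: "HOUSE" },
  LAND: { id: "LAND" }
};

const hasOwn = (object, key) =>
  Object.prototype.hasOwnProperty.call(object, key);

// "FLAT, HOUSE" -> ["FLAT", "HOUSE"]
const splitList = raw =>
  raw
    .split(",")
    .map(item => item.trim())
    .filter(item => item.length > 0);

// Hypothetical factory: reads `${prefix}_*` variables from `env`.
function buildCrawlerConfig(prefix, env, defaults = {}) {
  const read = name => env[`${prefix}_${name}`];
  const adTypeName = read("CRAWLER_AD_TYPE");
  const rawCategories = read("CRAWLER_AD_CATEGORIES");
  const rawUsernames = read("IGNORED_USERNAMES");
  const categoryNames =
    rawCategories !== undefined ? splitList(rawCategories) : ["FLAT", "HOUSE"];

  return {
    maxPages: parseInt(read("MAX_PAGES"), 10) || defaults.maxPages || 500,
    maxResultsPerPage:
      parseInt(read("MAX_RESULTS_PER_PAGE"), 10) ||
      defaults.maxResultsPerPage ||
      50,
    adType: hasOwn(CRAWLER_AD_TYPE, adTypeName)
      ? CRAWLER_AD_TYPE[adTypeName]
      : CRAWLER_AD_TYPE.NONE,
    // Unknown category names are dropped, as in the original files.
    adCategories: categoryNames
      .filter(name => hasOwn(AD_CATEGORY, name))
      .map(name => AD_CATEGORY[name].id),
    ignoredUsernames: rawUsernames !== undefined ? splitList(rawUsernames) : [],
    delayBetweenPages: parseInt(read("DELAY_BETWEEN_PAGES"), 10) || 1000,
    forceCrawl: !!parseInt(read("FORCE_CRAWL"), 10)
  };
}

const prostorConfig = buildCrawlerConfig(
  "PROSTOR",
  {
    PROSTOR_CRAWLER_AD_TYPE: "ONLY_RENT",
    PROSTOR_CRAWLER_AD_CATEGORIES: "FLAT, LAND, UNKNOWN"
  },
  { maxPages: 100, maxResultsPerPage: 5000 }
);

console.log(prostorConfig.adType); // 3
console.log(prostorConfig.adCategories); // [ 'FLAT', 'LAND' ]
console.log(prostorConfig.maxPages); // 100
console.log(prostorConfig.forceCrawl); // false
```

In the application, `env` would be `process.env` after `dotenv` has loaded the `.env` file.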

View File

@@ -0,0 +1,370 @@
"use strict";
const fetch = require("node-fetch");
const cheerio = require("cheerio");
const Promise = require("bluebird");
const moment = require("moment-timezone");
const htmlToText = require("html-to-text");
const {
AD_TYPE,
AD_CATEGORY,
AD_AGENCY,
AD_STATUS,
CRAWLER_AD_TYPE
} = require("../../common/enums");
const {
DEFAULT_TIMEZONE,
PRINT_CRAWLER_DEBUG
} = require("../../config/appConfig");
const AKTIDO_ENUMS = {
AKTIDO_AD_TYPE: {
[CRAWLER_AD_TYPE.ALL]: "/prodaja-1/najam-2",
[CRAWLER_AD_TYPE.ONLY_SELL]: "/prodaja-1",
[CRAWLER_AD_TYPE.ONLY_RENT]: "/najam-2"
},
AKTIDO_AD_CATEGORY: {
[AD_CATEGORY.ALL.id]: "",
[AD_CATEGORY.FLAT.id]: "/tip-2",
[AD_CATEGORY.HOUSE.id]: "/tip-1",
[AD_CATEGORY.LAND.id]: "/tip-5",
[AD_CATEGORY.OFFICE.id]: "/tip-4",
[AD_CATEGORY.APARTMENT.id]: "/tip-3",
[AD_CATEGORY.GARAGE.id]: "/tip-6"
//[AD_CATEGORY.COTTAGE.id]: ""
},
AKTIDO_PUBLISHED_DATE_FORMAT: "YYYY-MM-DD HH:mm:ss",
AKTIDO_RENEWED_DATE_FORMAT: "YYYY-MM-DD u HH:mm:ss"
};
const { AKTIDO_FORCE_CRAWL } = require("../specificConfigs/aktido");
class AktidoCrawler {
constructor(
savers = [],
crawlerAdTypes = CRAWLER_AD_TYPE.ALL,
crawlerAdCategories = [AD_CATEGORY.FLAT, AD_CATEGORY.HOUSE],
maxPages = 1000,
maxResultsPerPage = 100,
ignoredUsernames = [],
delayBetweenPages = 1000
) {
this.savers = savers;
this.baseUrl = "https://www.aktido.ba/pretraga/sortiraj-date_DESC";
this.crawlerAdTypes = crawlerAdTypes;
this.crawlerAdCategories = crawlerAdCategories;
this.maxPages = maxPages;
this.maxResultsPerPage = maxResultsPerPage;
this.delayBetweenPages = delayBetweenPages;
}
async crawl() {
const crawlAdCategories = this.crawlerAdCategories;
const newRealEstates = [];
if (crawlAdCategories) {
const indexGenerators = [];
for (const adCategory of crawlAdCategories) {
indexGenerators.push(this.categoryIndexer(adCategory));
}
let done = false;
while (!done) {
const categoryIndexerPromises = [];
const generatorsToRemove = [];
for (const indexGenerator of indexGenerators) {
categoryIndexerPromises.push(indexGenerator.next());
generatorsToRemove.push(false);
}
const singlePageResults = await Promise.all(categoryIndexerPromises);
const entries = singlePageResults.entries();
for (const [index, { value: singlePageResult }] of entries) {
if (singlePageResult) {
const saveResults = await this.saveCrawledResults(singlePageResult);
const { newRecords } = saveResults;
newRealEstates.push(...newRecords);
if (
Array.isArray(newRecords) &&
newRecords.length === 0 &&
!AKTIDO_FORCE_CRAWL
) {
generatorsToRemove[index] = true;
}
} else {
//Generator returned undefined, remove this generator from array
generatorsToRemove[index] = true;
// console.log("Generator ", index + 1, "has no more pages");
}
}
// console.log("Generators state : ", generatorsToRemove);
for (let i = generatorsToRemove.length - 1; i >= 0; i--) {
if (generatorsToRemove[i]) {
// console.log("\tRemove generator ", i + 1);
indexGenerators.splice(i, 1);
}
}
if (indexGenerators.length === 0) {
done = true;
}
await this.sleep(this.delayBetweenPages);
}
}
return newRealEstates;
}
async *categoryIndexer(adCategory) {
let pageToIndex = 1;
const urlAdTypePart = AKTIDO_ENUMS.AKTIDO_AD_TYPE[this.crawlerAdTypes];
const urlCategoryPart = AKTIDO_ENUMS.AKTIDO_AD_CATEGORY[adCategory];
if (urlAdTypePart !== undefined && urlCategoryPart !== undefined) {
while (true) {
const urlPageToCrawl = `${this.baseUrl}${urlAdTypePart}${urlCategoryPart}/stranica-${pageToIndex}`;
const singlePageResults = await this.indexSinglePage(
urlPageToCrawl,
this.maxResultsPerPage
);
if (Array.isArray(singlePageResults) && singlePageResults.length > 0) {
yield singlePageResults;
} else {
return undefined;
}
++pageToIndex;
if (pageToIndex === this.maxPages) {
return undefined;
}
}
} else {
return undefined;
}
}
async indexSinglePage(url, maxResultsPerPage) {
if (PRINT_CRAWLER_DEBUG) {
console.log("[AKTIDO] Index page : ", url);
}
try {
const res = await fetch(url);
const body = await res.text();
const $ = cheerio.load(body);
let hrefs = [];
$(
"body > div > div.container > div.row > div.col-xs-12.col-sm-12.col-md-12.col-lg-9.content-main > div.row.box-items.group-grid-view"
)
.find(".moreInfo")
.each((i, elem) => {
const href = $(elem)
.find("a")
.first()
.attr("href");
if (href) {
hrefs.push(href);
}
});
let actualNoOfResults =
hrefs.length <= maxResultsPerPage ? hrefs.length : maxResultsPerPage;
const asyncScraping = [];
for (let i = 0; i < actualNoOfResults; i++) {
asyncScraping.push(this.scrapeAd(hrefs[i]));
}
const scrapedData = await Promise.all(asyncScraping);
const filteredScrapedData = scrapedData.filter(adData => !!adData);
return filteredScrapedData;
} catch (e) {
console.error("[AKTIDO] Exception caught:" + e);
return [];
}
}
async scrapeAd(url) {
// console.log("[AKTIDO] Scraping : ", url);
try {
const adPageSource = await fetch(url);
const body = await adPageSource.text();
const $ = cheerio.load(body);
const mapElementParent = $(".box-map").parent();
const scriptElement = $("script", mapElementParent);
if (
scriptElement[0] &&
scriptElement[0].children &&
scriptElement[0].children[0] &&
scriptElement[0].children[0].data
) {
let extractedData;
try {
//data string starts with : var json_map_data = [{"r ...
//so we remove first 20 characters
const jsonData = scriptElement[0].children[0].data.substring(20);
const parsedJsonData = JSON.parse(jsonData);
extractedData = parsedJsonData[0];
} catch (e) {
throw { message: "Can't find ad data JSON" };
}
const aktidoId = extractedData["re_realEstates_id"];
const adCategory = this.getKiviCategoryIdFromAktidoId(
parseInt(extractedData["re_types_id"])
);
if (!adCategory) {
throw {
message: `Invalid category : ${extractedData["re_types_id"]}`
};
}
const adType = this.getKiviAdTypeFromAktidoActionId(
parseInt(extractedData["re_action_id"])
);
if (!adType) {
throw {
message: `Invalid ad type : ${extractedData["re_action_id"]}`
};
}
const title = extractedData["re_realEstates_portalName"];
const extractedPrice = parseFloat(
extractedData["re_realEstates_price"]
);
const price = extractedPrice ? extractedPrice : null;
const area = parseFloat(extractedData["re_realEstates_area"]);
const gardenSize = parseFloat(
extractedData["re_realEstates_fieldArea"]
);
const longDescription = htmlToText.fromString(
extractedData["re_realEstates_description"]
);
const locationLong = extractedData["re_realEstates_longitude"];
const locationLat = extractedData["re_realEstates_latitude"];
const publishedDateMoment = moment.tz(
extractedData["re_realEstates_inserted"],
AKTIDO_ENUMS.AKTIDO_PUBLISHED_DATE_FORMAT,
DEFAULT_TIMEZONE
);
if (!publishedDateMoment.isValid()) {
throw {
message: `Invalid published date : ${
extractedData["re_realEstates_inserted"]
}`
};
}
const renewedDateMoment = moment.tz(
extractedData["re_realEstates_edited"],
AKTIDO_ENUMS.AKTIDO_RENEWED_DATE_FORMAT,
DEFAULT_TIMEZONE
);
if (!renewedDateMoment.isValid()) {
throw {
message: `Invalid renewed date : ${
extractedData["re_realEstates_edited"]
}`
};
}
const adStatus = AD_STATUS.STATUS_NORMAL;
const data = {
url,
agencyObjectId: aktidoId,
originAgencyName: AD_AGENCY.AKTIDO,
realEstateType: adCategory,
adType,
title,
price,
area,
gardenSize,
shortDescription: "",
longDescription: longDescription,
streetNumber: 0,
streetName: "",
locality: "",
municipality: "",
city: "",
region: "",
entity: "",
country: "",
locationLat,
locationLong,
adStatus,
publishedDate: publishedDateMoment.toISOString(),
renewedDate: renewedDateMoment.toISOString()
};
return data;
} else {
console.log("[AKTIDO] No JSON data for this ad : ", url);
return null;
}
} catch (e) {
console.error("[AKTIDO] Exception caught: " + e.message, "\r\nURL:", url);
return null;
}
return null;
}
//======= HELPER FUNCTIONS =============
getKiviCategoryIdFromAktidoId(aktidoCategoryId) {
switch (aktidoCategoryId) {
case 1:
return AD_CATEGORY.HOUSE.id;
case 2:
return AD_CATEGORY.FLAT.id;
case 3:
return AD_CATEGORY.APARTMENT.id;
case 4:
return AD_CATEGORY.OFFICE.id;
case 5:
return AD_CATEGORY.LAND.id;
case 6:
return AD_CATEGORY.GARAGE.id;
default:
return undefined;
}
}
getKiviAdTypeFromAktidoActionId(actionId) {
switch (actionId) {
case 1:
return AD_TYPE.AD_TYPE_SALE.stringId;
case 2:
return AD_TYPE.AD_TYPE_RENT.stringId;
default:
return undefined;
}
}
async sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
}
async saveCrawledResults(results) {
const savers = this.savers;
// for (const saver of savers) {
// await saver.save(results);
// }
//For now, we use only Postgres saver, so ...
return await savers[0].save(results);
//so that we can use some sequelize options and information when data is inserted
}
}
module.exports = AktidoCrawler;
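
Note on the fixed-offset parsing in scrapeAd: the method drops the first 20 characters of the inline script before calling JSON.parse, which depends on the exact spelling and spacing of `var json_map_data = `. Below is a minimal sketch of a marker-based alternative. It assumes the script body always contains that assignment followed by a JSON array; `extractJsonMapData` is a hypothetical helper name and is not part of this repository.

```javascript
// Hypothetical helper (not in the crawler): pulls the first object out of a
// script body of the form `var json_map_data = [ {...}, ... ]`.
function extractJsonMapData(scriptText) {
  const marker = "var json_map_data =";
  const start = scriptText.indexOf(marker);
  if (start === -1) {
    return null; // assignment not found
  }
  // Keep everything after the marker and drop an optional trailing semicolon.
  const raw = scriptText
    .slice(start + marker.length)
    .trim()
    .replace(/;\s*$/, "");
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) && parsed.length > 0 ? parsed[0] : null;
  } catch (e) {
    return null; // payload was not valid JSON
  }
}

const sampleScript = 'var json_map_data = [{"re_realEstates_id":"42"}];';
console.log(extractJsonMapData(sampleScript));
```

Searching for the marker rather than counting characters would keep the parser working if the site adds leading whitespace or a trailing semicolon, at the cost of one extra string scan per ad.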

View File

@@ -13,13 +13,17 @@ const {
  CRAWLER_AD_TYPE
  } = require("../../common/enums");
- const { DEFAULT_TIMEZONE } = require("../../config/appConfig");
+ const {
+ DEFAULT_TIMEZONE,
+ PRINT_CRAWLER_DEBUG
+ } = require("../../config/appConfig");
  const OLX_ENUMS = {
  OLX_AD_TYPE: {
  [CRAWLER_AD_TYPE.ALL]: "",
  [CRAWLER_AD_TYPE.ONLY_SELL]: "&vrsta=samoprodaja",
- [CRAWLER_AD_TYPE.ONLY_RENT]: "&vrsta=samoizdavanje"
+ [CRAWLER_AD_TYPE.ONLY_RENT]: "&vrsta=samoizdavanje",
+ [CRAWLER_AD_TYPE.ONLY_REQUEST]: "&vrsta=samopotraznja"
  },
  OLX_AD_CATEGORY: {
  [AD_CATEGORY.FLAT.id]: "&kategorija=23",
@@ -35,6 +39,8 @@ const OLX_ENUMS = {
  OLX_RENEWED_DATE_FORMAT: "DD.MM.YYYY. u HH:mm"
  };
+ const { OLX_FORCE_CRAWL } = require("../specificConfigs/olx");
  class OlxCrawler {
  constructor(
  savers = [],
@@ -96,7 +102,7 @@ class OlxCrawler {
  "minute"
  );
- if (stopCrawlingThisCategory) {
+ if (stopCrawlingThisCategory && !OLX_FORCE_CRAWL) {
  generatorsToRemove[index] = true;
  // console.log("\tGenerator ", index + 1, "has no more new ads");
  break;
@@ -131,7 +137,7 @@ class OlxCrawler {
  const urlAdTypePart = OLX_ENUMS.OLX_AD_TYPE[this.crawlerAdTypes];
  const urlCategoryPart = OLX_ENUMS.OLX_AD_CATEGORY[adCategory];
- if (urlAdTypePart && urlCategoryPart) {
+ if (urlAdTypePart !== undefined && urlCategoryPart !== undefined) {
  while (true) {
  const urlPageToCrawl = `${this.baseUrl}${urlAdTypePart}${urlCategoryPart}&stranica=${pageToIndex}`;
  const singlePageResults = await this.indexSinglePage(
@@ -156,6 +162,10 @@ class OlxCrawler {
  }
  async indexSinglePage(url, maxResultsPerPage) {
+ if (PRINT_CRAWLER_DEBUG) {
+ console.log("[OLX] Index page : ", url);
+ }
  try {
  const res = await fetch(url);
  const body = await res.text();
@@ -205,7 +215,7 @@ class OlxCrawler {
  title: "#naslovartikla",
  descriptions: ".artikal_detaljniopis_tekst",
  category:
- "#artikal_glavni_div > div.artikal_lijevo > div:nth-child(3) > div > span:nth-child(3) > a > span"
+ "#artikal_glavni_div > div.artikal_lijevo > div.artikal_kat > div > span:nth-child(3) > a > span"
  };
  const username = $(propertySelectors.username)
@@ -377,7 +387,7 @@ class OlxCrawler {
  //=========================================
  const parsedCategory = this.getAdCategoryId(category);
  if (!parsedCategory) {
- throw { message: "Unknown ad category" };
+ throw { message: `Unknown ad category [${category}]` };
  }
  const parsedAdType = this.getAdTypeId(adType);
@@ -465,9 +475,11 @@ class OlxCrawler {
  getAdTypeId(adTypeText) {
  switch (adTypeText) {
  case "Prodaja":
- return AD_TYPE.AD_TYPE_SALE;
+ return AD_TYPE.AD_TYPE_SALE.stringId;
  case "Izdavanje":
- return AD_TYPE.AD_TYPE_RENT;
+ return AD_TYPE.AD_TYPE_RENT.stringId;
+ case "Potražnja":
+ return AD_TYPE.AD_TYPE_RENT.stringId;
  default:
  return undefined;
  }
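
Note on the categoryIndexer guard in the OLX crawler: the lookup table maps CRAWLER_AD_TYPE.ALL to an empty string, and an empty string is falsy in JavaScript. A truthiness test therefore treats "crawl all ad types" the same as an unknown key, which is presumably why the check now compares against undefined. The sketch below reproduces the difference with a hypothetical two-entry table (`URL_PARTS` and both function names are illustrative only).

```javascript
// Hypothetical lookup table shaped like OLX_ENUMS.OLX_AD_TYPE.
const URL_PARTS = {
  ALL: "", // valid entry, but an empty (falsy) string
  ONLY_SELL: "&vrsta=samoprodaja"
};

// Truthiness check: rejects the empty-string entry.
function isUsableTruthy(key) {
  return Boolean(URL_PARTS[key]);
}

// Explicit undefined check: only rejects keys missing from the table.
function isUsableDefined(key) {
  return URL_PARTS[key] !== undefined;
}

for (const key of ["ALL", "ONLY_SELL", "MISSING"]) {
  console.log(key, isUsableTruthy(key), isUsableDefined(key));
}
```

With the explicit comparison, only keys that are absent from the table are skipped; an entry that legitimately contributes nothing to the URL is still crawled.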

View File

@@ -0,0 +1,252 @@
"use strict";
const fetch = require("node-fetch");
const cheerio = require("cheerio");
const {
AD_TYPE,
AD_CATEGORY,
AD_AGENCY,
AD_STATUS,
CRAWLER_AD_TYPE
} = require("../../common/enums");
const { PRINT_CRAWLER_DEBUG } = require("../../config/appConfig");
const PROSTOR_ENUMS = {
PROSTOR_AD_TYPE: {
[CRAWLER_AD_TYPE.ALL]: "&action=0",
[CRAWLER_AD_TYPE.ONLY_SELL]: "&action=1",
[CRAWLER_AD_TYPE.ONLY_RENT]: "&action=2"
},
PROSTOR_AD_CATEGORY: {
[AD_CATEGORY.ALL.id]: "",
[AD_CATEGORY.FLAT.id]: "&type=7",
[AD_CATEGORY.HOUSE.id]: "&type=8",
[AD_CATEGORY.LAND.id]: "&type=10",
[AD_CATEGORY.OFFICE.id]: "&type=9",
[AD_CATEGORY.APARTMENT.id]: "&type=11",
[AD_CATEGORY.GARAGE.id]: "&type=14"
//[AD_CATEGORY.COTTAGE.id]: ""
},
PROSTOR_PUBLISHED_DATE_FORMAT: "YYYY-MM-DD HH:mm:ss",
PROSTOR_RENEWED_DATE_FORMAT: "YYYY-MM-DD u HH:mm:ss"
};
class ProstorCrawler {
constructor(
savers = [],
crawlerAdTypes = CRAWLER_AD_TYPE.ALL,
crawlerAdCategories = [AD_CATEGORY.FLAT, AD_CATEGORY.HOUSE],
maxPages = 5000,
maxResultsPerPage = 5000,
ignoredUsernames = [],
delayBetweenPages = 1000
) {
this.savers = savers;
this.baseUrl = "https://prostor.ba/pretraga";
this.crawlerAdTypes = crawlerAdTypes;
this.crawlerAdCategories = crawlerAdCategories;
this.maxResultsPerPage = maxResultsPerPage;
}
async crawl() {
const crawlAdCategories = this.crawlerAdCategories;
const newRealEstates = [];
if (crawlAdCategories) {
for (const adCategory of crawlAdCategories) {
const urlAdTypePart =
PROSTOR_ENUMS.PROSTOR_AD_TYPE[this.crawlerAdTypes];
const urlCategoryPart = PROSTOR_ENUMS.PROSTOR_AD_CATEGORY[adCategory];
if (urlAdTypePart !== undefined && urlCategoryPart !== undefined) {
const urlPageToCrawl = `${this.baseUrl}?remove_sold=1${urlAdTypePart}${urlCategoryPart}`;
const singleCategoryResults = await this.extractRealEstates(
urlPageToCrawl
);
const resultsSubset = singleCategoryResults.slice(
0,
this.maxResultsPerPage
);
const saveResults = await this.saveCrawledResults(resultsSubset);
const { newRecords } = saveResults;
newRealEstates.push(...newRecords);
}
}
}
return newRealEstates;
}
async extractRealEstates(url) {
if (PRINT_CRAWLER_DEBUG) {
console.log("[PROSTOR] Index page : ", url);
}
try {
const res = await fetch(url);
const body = await res.text();
const $ = cheerio.load(body);
const scriptElement = $(
"body > div > div.container-fluid > script:nth-child(7)"
);
if (
scriptElement[0] &&
scriptElement[0].children &&
scriptElement[0].children[0] &&
scriptElement[0].children[0].data
) {
const scriptData = scriptElement[0].children[0].data;
try {
// script element data contains JS code and we need to extract only data for realEstates
// data string starts with : var map; var markers = [{"r ...
// so we remove first 23 characters
//
// real estate JSON data ends with ...}, ]; map = new...
// so we need to find index of that substring to know where to stop
// we will NOT include trailing comma because it breaks JSON parse, so we have to close ] bracket manually
const jsonEndIndex = scriptData.indexOf(", ]; map = new");
if (jsonEndIndex > -1) {
const jsonData = scriptData.substring(23, jsonEndIndex) + "]";
const realEstates = JSON.parse(jsonData);
const transformedRealEstates = [];
for (const realEstate of realEstates) {
const transformedRealEstate = ProstorCrawler.transformRealEstateData(
realEstate
);
if (transformedRealEstate) {
transformedRealEstates.push(transformedRealEstate);
}
}
return transformedRealEstates;
} else {
throw {
message: "Something is wrong with JSON data or data is moved"
};
}
} catch (e) {
console.log(e);
throw { message: "Can't find ad data JSON" };
}
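} else {
// No inline script data found: return an empty list so crawl() can still call .slice()
console.log("[PROSTOR] No JSON data on page : ", url);
return [];
}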
} catch (e) {
console.error("[PROSTOR] Exception caught:", e.message);
return [];
}
}
static transformRealEstateData(realEstateData) {
try {
const { lat, lng, property_name, price, size, link } = realEstateData;
// link contains part of the URL in the format of : /prodaja/stan/stup/9556
// general form is : /actionType/realEstateType/location/realEstateID
// linkParts contains : ['', 'actionType', 'realEstateType', 'location', 'realEstateID']
const linkParts = link.split("/");
const adType = ProstorCrawler.getAdTypeId(linkParts[1]);
const realEstateType = ProstorCrawler.getAdCategoryId(linkParts[2]);
const prostorId = linkParts[4];
const url = `https://prostor.ba${link}`;
if (!adType || !realEstateType || !prostorId) {
return null;
}
const adStatus = AD_STATUS.STATUS_NORMAL;
const parsedPrice = parseFloat(price.replace(/\./g, "")) || null;
const parsedArea = parseFloat(size);
const data = {
url,
agencyObjectId: prostorId,
originAgencyName: AD_AGENCY.PROSTOR,
realEstateType,
adType,
title: property_name,
price: parsedPrice,
area: parsedArea,
gardenSize: null,
shortDescription: "",
longDescription: "",
streetNumber: 0,
streetName: "",
locality: "",
municipality: "",
city: "",
region: "",
entity: "",
country: "",
locationLat: lat,
locationLong: lng,
adStatus,
publishedDate: null,
renewedDate: null
};
return data;
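} catch (e) {
// `url` is declared inside the try block and is out of scope here,
// so log the raw link from the input object instead
console.error(
"[PROSTOR] Exception caught: " + e.message,
"\r\nLink:",
realEstateData && realEstateData.link
);
return null;
}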
}
//======= HELPER FUNCTIONS =============
static getAdCategoryId(categoryText) {
switch (categoryText) {
case "stan":
return AD_CATEGORY.FLAT.id;
case "kuca":
return AD_CATEGORY.HOUSE.id;
case "apartman":
return AD_CATEGORY.APARTMENT.id;
case "poslovni-prostor":
return AD_CATEGORY.OFFICE.id;
case "garaza":
return AD_CATEGORY.GARAGE.id;
case "zemljiste":
return AD_CATEGORY.LAND.id;
default:
return undefined;
}
}
static getAdTypeId(adTypeText) {
switch (adTypeText) {
case "prodaja":
return AD_TYPE.AD_TYPE_SALE.stringId;
case "najam":
return AD_TYPE.AD_TYPE_RENT.stringId;
default:
return undefined;
}
}
async saveCrawledResults(results) {
const savers = this.savers;
// for (const saver of savers) {
// await saver.save(results);
// }
//For now, we use only Postgres saver, so ...
return await savers[0].save(results);
//so that we can use some sequelize options and information when data is inserted
}
}
module.exports = ProstorCrawler;
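
Note on the marker extraction in extractRealEstates: the comments describe cutting the array out of `var map; var markers = [ ... , ]; map = new ...` and closing the bracket by hand because the trailing comma breaks JSON.parse. A minimal sketch of that step in isolation follows. It assumes the start and end markers appear exactly as quoted in those comments; `extractMarkers` is a hypothetical name, and the sample string is made up for illustration.

```javascript
// Hypothetical helper: returns the parsed markers array, or [] when the
// expected markers are missing or the payload is not valid JSON.
function extractMarkers(scriptData) {
  const startMarker = "var markers = ";
  const endMarker = ", ]; map = new";
  const start = scriptData.indexOf(startMarker);
  const end = scriptData.indexOf(endMarker);
  if (start === -1 || end === -1 || end < start) {
    return [];
  }
  // Stop before the trailing comma, then close the array manually.
  const json = scriptData.substring(start + startMarker.length, end) + "]";
  try {
    return JSON.parse(json);
  } catch (e) {
    return [];
  }
}

const sampleData =
  'var map; var markers = [{"lat":"43.85"}, {"lat":"43.86"}, ]; map = new Map();';
console.log(extractMarkers(sampleData).length);
```

Locating the start marker by search instead of a fixed offset of 23 is a design choice of this sketch, not what the crawler does; it trades a small amount of work for tolerance to leading whitespace.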

View File

@@ -0,0 +1,370 @@
"use strict";
const fetch = require("node-fetch");
const cheerio = require("cheerio");
const Promise = require("bluebird");
const moment = require("moment-timezone");
const htmlToText = require("html-to-text");
const {
AD_TYPE,
AD_CATEGORY,
AD_AGENCY,
AD_STATUS,
CRAWLER_AD_TYPE
} = require("../../common/enums");
const {
DEFAULT_TIMEZONE,
PRINT_CRAWLER_DEBUG
} = require("../../config/appConfig");
const RENTAL_ENUMS = {
RENTAL_AD_TYPE: {
[CRAWLER_AD_TYPE.ALL]: "/prodaja-1/najam-2",
[CRAWLER_AD_TYPE.ONLY_SELL]: "/prodaja-1",
[CRAWLER_AD_TYPE.ONLY_RENT]: "/najam-2"
},
RENTAL_AD_CATEGORY: {
[AD_CATEGORY.ALL.id]: "",
[AD_CATEGORY.FLAT.id]: "/tip-2",
[AD_CATEGORY.HOUSE.id]: "/tip-1",
[AD_CATEGORY.LAND.id]: "/tip-5",
[AD_CATEGORY.OFFICE.id]: "/tip-4",
[AD_CATEGORY.APARTMENT.id]: "/tip-3",
[AD_CATEGORY.GARAGE.id]: "/tip-6"
//[AD_CATEGORY.COTTAGE.id]: ""
},
RENTAL_PUBLISHED_DATE_FORMAT: "YYYY-MM-DD HH:mm:ss",
RENTAL_RENEWED_DATE_FORMAT: "YYYY-MM-DD u HH:mm:ss"
};
const { RENTAL_FORCE_CRAWL } = require("../specificConfigs/rental");
class RentalCrawler {
constructor(
savers = [],
crawlerAdTypes = CRAWLER_AD_TYPE.ALL,
crawlerAdCategories = [AD_CATEGORY.FLAT, AD_CATEGORY.HOUSE],
maxPages = 1000,
maxResultsPerPage = 100,
ignoredUsernames = [],
delayBetweenPages = 1000
) {
this.savers = savers;
this.baseUrl = "https://www.rental.ba/pretraga/sortiraj-date_DESC";
this.crawlerAdTypes = crawlerAdTypes;
this.crawlerAdCategories = crawlerAdCategories;
this.maxPages = maxPages;
this.maxResultsPerPage = maxResultsPerPage;
this.delayBetweenPages = delayBetweenPages;
}
async crawl() {
const crawlAdCategories = this.crawlerAdCategories;
const newRealEstates = [];
if (crawlAdCategories) {
const indexGenerators = [];
for (const adCategory of crawlAdCategories) {
indexGenerators.push(this.categoryIndexer(adCategory));
}
let done = false;
while (!done) {
const categoryIndexerPromises = [];
const generatorsToRemove = [];
for (const indexGenerator of indexGenerators) {
categoryIndexerPromises.push(indexGenerator.next());
generatorsToRemove.push(false);
}
const singlePageResults = await Promise.all(categoryIndexerPromises);
const entries = singlePageResults.entries();
for (const [index, { value: singlePageResult }] of entries) {
if (singlePageResult) {
const saveResults = await this.saveCrawledResults(singlePageResult);
const { newRecords } = saveResults;
newRealEstates.push(...newRecords);
if (
Array.isArray(newRecords) &&
newRecords.length === 0 &&
!RENTAL_FORCE_CRAWL
) {
generatorsToRemove[index] = true;
}
} else {
//Generator returned undefined, remove this generator from array
generatorsToRemove[index] = true;
// console.log("Generator ", index + 1, "has no more pages");
}
}
// console.log("Generators state : ", generatorsToRemove);
for (let i = generatorsToRemove.length - 1; i >= 0; i--) {
if (generatorsToRemove[i]) {
// console.log("\tRemove generator ", i + 1);
indexGenerators.splice(i, 1);
}
}
if (indexGenerators.length === 0) {
done = true;
}
await this.sleep(this.delayBetweenPages);
}
}
return newRealEstates;
}
async *categoryIndexer(adCategory) {
let pageToIndex = 1;
const urlAdTypePart = RENTAL_ENUMS.RENTAL_AD_TYPE[this.crawlerAdTypes];
const urlCategoryPart = RENTAL_ENUMS.RENTAL_AD_CATEGORY[adCategory];
if (urlAdTypePart !== undefined && urlCategoryPart !== undefined) {
while (true) {
const urlPageToCrawl = `${this.baseUrl}${urlAdTypePart}${urlCategoryPart}/stranica-${pageToIndex}`;
const singlePageResults = await this.indexSinglePage(
urlPageToCrawl,
this.maxResultsPerPage
);
if (Array.isArray(singlePageResults) && singlePageResults.length > 0) {
yield singlePageResults;
} else {
return undefined;
}
++pageToIndex;
if (pageToIndex === this.maxPages) {
return undefined;
}
}
} else {
return undefined;
}
}
async indexSinglePage(url, maxResultsPerPage) {
if (PRINT_CRAWLER_DEBUG) {
console.log("[RENTAL] Index page : ", url);
}
try {
const res = await fetch(url);
const body = await res.text();
const $ = cheerio.load(body);
let hrefs = [];
$(
"body > div > div.container > div.row > div.col-xs-12.col-sm-12.col-md-12.col-lg-9.content-main > div.row.box-items.group-grid-view"
)
.find(".pull-right")
.each((i, elem) => {
const href = $(elem)
.find("a")
.first()
.attr("href");
if (href) {
hrefs.push(href);
}
});
let actualNoOfResults =
hrefs.length <= maxResultsPerPage ? hrefs.length : maxResultsPerPage;
const asyncScraping = [];
for (let i = 0; i < actualNoOfResults; i++) {
asyncScraping.push(this.scrapeAd(hrefs[i]));
}
const scrapedData = await Promise.all(asyncScraping);
const filteredScrapedData = scrapedData.filter(adData => !!adData);
return filteredScrapedData;
} catch (e) {
console.error("[RENTAL] Exception caught:" + e);
return [];
}
}
async scrapeAd(url) {
console.log("[RENTAL] Scraping : ", url);
try {
const adPageSource = await fetch(url);
const body = await adPageSource.text();
const $ = cheerio.load(body);
const mapElementParent = $(".box-map").parent();
const scriptElement = $("script", mapElementParent);
if (
scriptElement[0] &&
scriptElement[0].children &&
scriptElement[0].children[0] &&
scriptElement[0].children[0].data
) {
let extractedData;
try {
//data string starts with : var json_map_data = [{"r ...
//so we remove first 20 characters
const jsonData = scriptElement[0].children[0].data.substring(20);
const parsedJsonData = JSON.parse(jsonData);
extractedData = parsedJsonData[0];
} catch (e) {
throw { message: "Can't find ad data JSON" };
}
const rentalId = extractedData["re_realEstates_id"];
const adCategory = this.getKiviCategoryIdFromRentalId(
parseInt(extractedData["re_types_id"])
);
if (!adCategory) {
throw {
message: `Invalid category : ${extractedData["re_types_id"]}`
};
}
const adType = this.getKiviAdTypeFromRentalActionId(
parseInt(extractedData["re_action_id"])
);
if (!adType) {
throw {
message: `Invalid ad type : ${extractedData["re_action_id"]}`
};
}
const title = extractedData["re_realEstates_portalName"];
const extractedPrice = parseFloat(
extractedData["re_realEstates_price"]
);
const price = extractedPrice ? extractedPrice : null;
const area = parseFloat(extractedData["re_realEstates_area"]);
const gardenSize = parseFloat(
extractedData["re_realEstates_fieldArea"]
);
const longDescription = htmlToText.fromString(
extractedData["re_realEstates_description"]
);
const locationLong = extractedData["re_realEstates_longitude"];
const locationLat = extractedData["re_realEstates_latitude"];
const publishedDateMoment = moment.tz(
extractedData["re_realEstates_inserted"],
RENTAL_ENUMS.RENTAL_PUBLISHED_DATE_FORMAT,
DEFAULT_TIMEZONE
);
if (!publishedDateMoment.isValid()) {
throw {
message: `Invalid published date : ${
extractedData["re_realEstates_inserted"]
}`
};
}
const renewedDateMoment = moment.tz(
extractedData["re_realEstates_edited"],
RENTAL_ENUMS.RENTAL_RENEWED_DATE_FORMAT,
DEFAULT_TIMEZONE
);
if (!renewedDateMoment.isValid()) {
throw {
message: `Invalid renewed date : ${
extractedData["re_realEstates_edited"]
}`
};
}
const adStatus = AD_STATUS.STATUS_NORMAL;
const data = {
url,
agencyObjectId: rentalId,
originAgencyName: AD_AGENCY.RENTAL,
realEstateType: adCategory,
adType,
title,
price,
area,
gardenSize,
shortDescription: "",
longDescription: longDescription,
streetNumber: 0,
streetName: "",
locality: "",
municipality: "",
city: "",
region: "",
entity: "",
country: "",
locationLat,
locationLong,
adStatus,
publishedDate: publishedDateMoment.toISOString(),
renewedDate: renewedDateMoment.toISOString()
};
return data;
} else {
console.log("[RENTAL] No JSON data for this ad : ", url);
return null;
}
} catch (e) {
console.error("[RENTAL] Exception caught: " + e.message, "\r\nURL:", url);
return null;
}
return null;
}
//======= HELPER FUNCTIONS =============
getKiviCategoryIdFromRentalId(rentalCategoryId) {
switch (rentalCategoryId) {
case 1:
return AD_CATEGORY.HOUSE.id;
case 2:
return AD_CATEGORY.FLAT.id;
case 3:
return AD_CATEGORY.APARTMENT.id;
case 4:
return AD_CATEGORY.OFFICE.id;
case 5:
return AD_CATEGORY.LAND.id;
case 6:
return AD_CATEGORY.GARAGE.id;
default:
return undefined;
}
}
getKiviAdTypeFromRentalActionId(actionId) {
switch (actionId) {
case 1:
return AD_TYPE.AD_TYPE_SALE.stringId;
case 2:
return AD_TYPE.AD_TYPE_RENT.stringId;
default:
return undefined;
}
}
async sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
}
async saveCrawledResults(results) {
const savers = this.savers;
// for (const saver of savers) {
// await saver.save(results);
// }
//For now, we use only Postgres saver, so ...
return await savers[0].save(results);
//so that we can use some sequelize options and information when data is inserted
}
}
module.exports = RentalCrawler;
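
Note on the crawl() loop in this file (the Aktido crawler uses the same shape): one async generator is created per category, all generators are advanced by one page per round, and a generator is dropped once it is exhausted or a page produces no new records. The sketch below is a stripped-down version of that pattern. The names `pageGenerator`, `crawlInLockStep` and `shouldStop` are hypothetical, and in-memory arrays stand in for HTTP pages and the saver.

```javascript
// Stand-in for categoryIndexer: yields one "page" (an array) at a time.
async function* pageGenerator(pages) {
  for (const page of pages) {
    yield page;
  }
}

// Advance every active generator by one page per round. A generator is
// dropped when it is done or when shouldStop(page) returns true.
async function crawlInLockStep(generators, shouldStop) {
  const collected = [];
  let active = [...generators];
  while (active.length > 0) {
    const results = await Promise.all(active.map(gen => gen.next()));
    const stillActive = [];
    results.forEach(({ value, done }, index) => {
      if (done || !value) {
        return; // generator has no more pages
      }
      collected.push(...value);
      if (!shouldStop(value)) {
        stillActive.push(active[index]);
      }
    });
    active = stillActive;
  }
  return collected;
}

crawlInLockStep(
  [pageGenerator([["a1"], ["a2"]]), pageGenerator([["b1"]])],
  () => false
).then(items => console.log(items));
```

Building a fresh `stillActive` list each round avoids the reverse-order splice used in the crawler; both approaches keep indices stable while generators are being removed.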

View File

@@ -1,5 +1,5 @@
  const isValidEmail = email => {
- const simpleEmailRegex = /^.+@.+\..+$/;
+ const simpleEmailRegex = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;
  return email && email.length < 250 && simpleEmailRegex.test(email);
  };
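
Note on the stricter email pattern: the old expression only required "something@something.something", so it accepted input such as a doubled @ sign. The sketch below runs both patterns from this hunk over a few sample addresses to show where they disagree. `checkEmail` is a hypothetical wrapper that mirrors isValidEmail but always returns a boolean, and the sample addresses are made up.

```javascript
// Old and new patterns copied from the hunk above.
const oldEmailRegex = /^.+@.+\..+$/;
const newEmailRegex = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;

// Hypothetical wrapper: same length limit as isValidEmail, boolean result.
function checkEmail(regex, email) {
  return Boolean(email) && email.length < 250 && regex.test(email);
}

const samples = ["user@example.com", "user@@example.com", "a@b.c"];
for (const sample of samples) {
  console.log(
    sample,
    "old:", checkEmail(oldEmailRegex, sample),
    "new:", checkEmail(newEmailRegex, sample)
  );
}
```

The first address passes both patterns. The other two pass only the old one: the new pattern forbids @ inside the local part and requires a top-level domain of at least two letters.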

View File

@@ -26,7 +26,7 @@ module.exports = (sequelize, DataTypes) => {
  adType: {
  type: DataTypes.TEXT,
  allowNull: false,
- defaultValue: AD_TYPE.AD_TYPE_SALE
+ defaultValue: AD_TYPE.AD_TYPE_SALE.stringId
  },
  email: DataTypes.TEXT,
  locality: DataTypes.TEXT,

Binary file not shown. Size: 2.2 MiB

Binary file not shown. Size: 30 KiB

Binary file not shown. Size: 216 KiB

View File

@@ -61,18 +61,6 @@ body {
  box-sizing: border-box;
  }
- #floating-panel {
- top: 10px;
- left: 25%;
- z-index: 5;
- background-color: #fff;
- border: 1px solid #999;
- text-align: center;
- font-family: "Roboto", "sans-serif";
- line-height: 30px;
- padding: 5px 5px 5px 10px;
- }
  .btn:hover {
  background-color: white;
  color: #02adba;
@@ -95,7 +83,22 @@ h6.title {
  margin-right: 10px;
  }
+ .full-width {
+ width: 100%;
+ }
+ strong {
+ font-weight: bold;
+ }
  h3 {
  font-size: 15px;
  line-height: 1.5;
  }
+ .sliderInputBox {
+ box-shadow: 1px 1px 1px rgba(0, 0, 0, 0.4) !important;
+ border: 1px solid #02adba !important;
+ border-radius: 4px !important;
+ text-align: center;
+ }

29
app/public/segment.css Normal file
View File

@@ -0,0 +1,29 @@
.ui-segment {
color: #02adba;
border: 1px solid #02adba;
border-radius: 4px;
display: inline-block;
}
.ui-segment span.option.active {
background-color: #02adba;
color: white;
}
.ui-segment span.option {
padding-left: 30px;
padding-right: 30px;
height: 35px;
text-align: center;
display: inline-block;
line-height: 35px;
margin: 0px;
float: left;
cursor: pointer;
border-right: 1px solid #02adba;
}
.ui-segment span.option:last-child {
border-right: none;
}
.segment-select {
display: none;
}

View File

@@ -1,10 +1,30 @@
- <div class="row centered-element">
- Super. Poslali smo Vam potvrdni email na Vašu email adresu.
- Poslije tog emaila, svaki put kada nađemo nove nekretnine koje Vam odgovaraju
- javićemo Vam emailom.
- <br><br>
- <a href="/" class="">
- Nova pretraga
- </a>
+ <br>
+ <div class="row left-align">
+ <p>Super. Poslali smo Vam potvrdni email na Vašu email adresu.
+ Poslije tog emaila, svaki put kada nađemo nove nekretnine koje Vam odgovaraju
+ javićemo Vam emailom.
+ </p>
+ <a href="/" class="">Nova pretraga</a>
+ </div>
+ <div class="row left-align">
+ <p>Ako koristite gmail sa uključenom opcijom <strong>Promotions</strong>
+ postoji šansa da nećete dobiti notifikaciju kada email od Kivija dođe.
+ Da biste počeli dobijati notifikacije otiđite u <strong>Promotions</strong> label i
+ pomjerite email u <strong>Primary</strong> label kao na videu ispod.
+ </p>
+ <br>
+ <div class="hide-on-med-and-down">
+ <h6>Web preglednik</h6>
+ <br>
+ <img class="full-width" src="assets/images/web-promotions.gif">
+ <br>
+ <h6>Mobilna Gmail aplikacija</h6>
+ <br>
+ </div>
+ <div class="col s12 m6 l6 push-m3 push-l3">
+ <img src="assets/images/android-promotions.gif" width="100%">
+ </div>
  </div>

View File

@@ -10,7 +10,6 @@
  gtag('config', '<%= process.env.GA_ID %>');
  </script>
- <link href="https://fonts.googleapis.com/css?family=Roboto&display=swap" rel="stylesheet">
  <link href="https://fonts.googleapis.com/icon?family=Material+Icons" rel="stylesheet">
  <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/1.0.0/css/materialize.min.css">
  <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/noUiSlider/13.1.5/nouislider.min.css">
@@ -19,6 +18,7 @@
  <script src="https://code.jquery.com/jquery-2.1.1.min.js"></script>
  <meta charset="UTF-8" />
  <link rel="stylesheet" href="/assets/main.css">
+ <link rel="stylesheet" href="/assets/segment.css">
  <link rel="apple-touch-icon" sizes="180x180" href="/assets/apple-touch-icon.png">
  <link rel="icon" type="image/png" sizes="32x32" href="/assets/favicon-32x32.png">
@@ -28,6 +28,20 @@
  <meta name="msapplication-TileColor" content="#da532c">
  <meta name="theme-color" content="#ffffff">
+ <meta name="title" content="Kivi.ba">
+ <meta name="description" content="Neka dom nađe vas">
+ <meta property="og:title" content="Kivi.ba">
+ <meta property="og:description" content="Neka dom nađe vas">
+ <meta property="og:image" content="http://www.kivi.ba/assets/images/thumbnail-big.png">
+ <meta property="og:image:width" content="1200">
+ <meta property="og:image:height" content="627">
+ <meta property="og:url" content="https://www.kivi.ba">
+ <meta name="twitter:card" content="summary_large_image">
+ <meta property="og:site_name" content="Kivi.ba">
+ <meta name="twitter:image:alt" content="Kivi.ba - Neka dom nađe vas">
  <%if (title) { %>
  <title> <%= title %> - Kivi.ba</title>
  <% } else { %>

View File

@@ -48,7 +48,21 @@
  <script>
  $(document).ready( () => {
  $("#submit").click( () => {
- $("#form-queryreview").submit();
+ const simpleEmailRegex = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;
+ const email = $("#email").val();
+ const confirmEmail = $("#confirmEmail").val();
+ if (email !== confirmEmail){
+ $("#error-label-email").text("Greška ! Unešeni emailovi nisu isti");
+ return;
+ }
+ if (simpleEmailRegex.test(email)){
+ $("#submit").attr("disabled", true);
+ $("#form-queryreview").submit();
+ }else{
+ // The addresses match here, so report an invalid format rather than a mismatch
+ $("#error-label-email").text("Greška ! Unešeni email nije ispravan");
+ }
  });
  });
  </script>

View File

@@ -5,32 +5,57 @@
  <h5>Cijena</h5>
  <br><br>
  <div class="center-align no-ui-slider" id="priceFilter"></div>
- <input type="hidden" id="priceFilterMin" name="priceFilterMin">
- <input type="hidden" id="priceFilterMax" name="priceFilterMax">
  </div>
- <br><br>
+ <br>
+ <div class="row">
+ <div class="col s5 m3 l3 push-m1 push-l2">
+ <input class="sliderInputBox" type="number" id="priceMin" name="priceMin">
+ </div>
+ <div class="col s5 m3 l3 push-s1 push-m4 push-l4">
+ <input class="sliderInputBox" type="number" id="priceMax" name="priceMax">
+ </div>
+ </div>
+ <br>
  <div class="row center-align">
  <h5>Površina</h5>
  <br><br>
  <div class="center-align no-ui-slider" id="sizeFilter"></div>
- <input type="hidden" id="sizeFilterMin" name="sizeFilterMin">
- <input type="hidden" id="sizeFilterMax" name="sizeFilterMax">
  </div>
- <br><br>
+ <br>
+ <div class="row">
+ <div class="col s5 m3 l3 push-m1 push-l2">
+ <input class="sliderInputBox" type="number" id="sizeMin" name="sizeMin">
+ </div>
+ <div class="col s5 m3 l3 push-s1 push-m4 push-l4">
+ <input class="sliderInputBox" type="number" id="sizeMax" name="sizeMax">
+ </div>
+ </div>
+ <br>
  <% if(hasGardenSize) { %>
  <div class="row center-align">
  <h5>Površina okućnice</h5>
  <br><br>
  <div class="center-align no-ui-slider" id="gardenSizeFilter"></div>
- <input type="hidden" id="gardenSizeFilterMin" name="gardenSizeFilterMin">
- <input type="hidden" id="gardenSizeFilterMax" name="gardenSizeFilterMax">
  </div>
- <br><br>
+ <br>
+ <div class="row">
+ <div class="col s5 m3 l3 push-m1 push-l2">
+ <input class="sliderInputBox" type="number" id="gardenSizeMin" name="gardenSizeMin">
+ </div>
+ <div class="col s5 m3 l3 push-s1 push-m4 push-l4">
+ <input class="sliderInputBox" type="number" id="gardenSizeMax" name="gardenSizeMax">
+ </div>
+ </div>
  <% } %>
  <div class="row">
@@ -42,42 +67,121 @@
<script> <script>
$(document).ready(() => { $(document).ready(() => {
const priceFormat = wNumb({ const priceSliderOptions = {...<%- priceSliderOptions %>};
thousand: ".", const sizeSliderOptions = {...<%- sizeSliderOptions %>};
suffix: " KM" const priceStep = priceSliderOptions.step;
}); const sizeStep = sizeSliderOptions.step;
const sizeFormat = wNumb({ delete priceSliderOptions.step;
thousand: ".", delete sizeSliderOptions.step;
suffix: " m2"
}); const updatePriceInputs = (values, handle, unencoded) => {
$("#priceMin").val(Math.round(unencoded[0]/priceStep)*priceStep);
$("#priceMax").val(Math.round(unencoded[1]/priceStep)*priceStep);
}
const updateSizeInputs = (values, handle, unencoded) => {
$("#sizeMin").val(Math.round(unencoded[0]/sizeStep)*sizeStep);
$("#sizeMax").val(Math.round(unencoded[1]/sizeStep)*sizeStep);
}
const priceSlider = document.getElementById("priceFilter"); const priceSlider = document.getElementById("priceFilter");
const extendedPriceSliderOptions = {...<%- priceSliderOptions %>, format: priceFormat};
noUiSlider.create(priceSlider, extendedPriceSliderOptions);
const sizeSlider = document.getElementById("sizeFilter"); const sizeSlider = document.getElementById("sizeFilter");
const extendedSizeSliderOptions = {...<%- sizeSliderOptions %>, format: sizeFormat};
noUiSlider.create(sizeSlider, extendedSizeSliderOptions); const priceSliderObject = noUiSlider.create(priceSlider, priceSliderOptions);
const sizeSliderObject = noUiSlider.create(sizeSlider, sizeSliderOptions);
priceSliderObject.on('slide', updatePriceInputs);
sizeSliderObject.on('slide', updateSizeInputs);
const priceMinChangeHandler = (element) => {
if (element && element.currentTarget && element.currentTarget.value){
const currentValues = priceSliderObject.get();
const newValue = element.currentTarget.value;
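// Input values are strings, so compare them numerically rather than lexicographically
const fixedNewValue = Number(newValue) > Number(currentValues[1]) ? currentValues[1] : newValue;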
priceSliderObject.set([fixedNewValue, null]);
$("#priceMin").val(Math.round(priceSliderObject.get()[0]));
}
}
const priceMaxChangeHandler = (element) => {
if (element && element.currentTarget && element.currentTarget.value){
const newValue = element.currentTarget.value;
priceSliderObject.set([null, newValue]);
$("#priceMax").val(Math.round(priceSliderObject.get()[1]));
}
}
$("#priceMin").val(priceSliderOptions.start[0]);
$("#priceMax").val(priceSliderOptions.start[1]);
$("#priceMin").change(priceMinChangeHandler);
$("#priceMax").change(priceMaxChangeHandler);
const sizeMinChangeHandler = (element) => {
if (element && element.currentTarget && element.currentTarget.value){
const currentValues = sizeSliderObject.get();
const newValue = element.currentTarget.value;
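// Input values are strings, so compare them numerically rather than lexicographically
const fixedNewValue = Number(newValue) > Number(currentValues[1]) ? currentValues[1] : newValue;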
sizeSliderObject.set([fixedNewValue, null]);
$("#sizeMin").val(Math.round(sizeSliderObject.get()[0]));
}
}
const sizeMaxChangeHandler = (element) => {
if (element && element.currentTarget && element.currentTarget.value){
const newValue = element.currentTarget.value;
sizeSliderObject.set([null, newValue]);
$("#sizeMax").val(Math.round(sizeSliderObject.get()[1]));
}
}
$("#sizeMin").val(sizeSliderOptions.start[0]);
$("#sizeMax").val(sizeSliderOptions.start[1]);
$("#sizeMin").change(sizeMinChangeHandler);
$("#sizeMax").change(sizeMaxChangeHandler);
<% if(hasGardenSize) { %> <% if(hasGardenSize) { %>
const gardenSizeSlider = document.getElementById("gardenSizeFilter"); const gardenSizeSliderOptions = {...<%- gardenSizeSliderOptions %>};
const extendedGardenSizeSliderOptions = {...<%- gardenSizeSliderOptions %>, format: sizeFormat}; const gardenSizeStep = gardenSizeSliderOptions.step;
noUiSlider.create(gardenSizeSlider, extendedGardenSizeSliderOptions); delete gardenSizeSliderOptions.step;
const updateGardenSizeInputs = (values, handle, unencoded) => {
$("#gardenSizeMin").val(Math.round(unencoded[0]/gardenSizeStep)*gardenSizeStep);
$("#gardenSizeMax").val(Math.round(unencoded[1]/gardenSizeStep)*gardenSizeStep);
}
const gardenSizeSlider = document.getElementById("gardenSizeFilter");
const gardenSizeSliderObject = noUiSlider.create(gardenSizeSlider, gardenSizeSliderOptions);
gardenSizeSliderObject.on('slide', updateGardenSizeInputs);
const gardenSizeMinChangeHandler = (element) => {
if (element && element.currentTarget && element.currentTarget.value){
const currentValues = gardenSizeSliderObject.get();
const newValue = element.currentTarget.value;
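// Input values are strings, so compare them numerically rather than lexicographically
const fixedNewValue = Number(newValue) > Number(currentValues[1]) ? currentValues[1] : newValue;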
gardenSizeSliderObject.set([fixedNewValue, null]);
$("#gardenSizeMin").val(Math.round(gardenSizeSliderObject.get()[0]));
}
}
const gardenSizeMaxChangeHandler = (element) => {
if (element && element.currentTarget && element.currentTarget.value){
const newValue = element.currentTarget.value;
gardenSizeSliderObject.set([null, newValue]);
$("#gardenSizeMax").val(Math.round(gardenSizeSliderObject.get()[1]));
}
}
$("#gardenSizeMin").val(gardenSizeSliderOptions.start[0]);
$("#gardenSizeMax").val(gardenSizeSliderOptions.start[1]);
$("#gardenSizeMin").change(gardenSizeMinChangeHandler);
$("#gardenSizeMax").change(gardenSizeMaxChangeHandler);
<% } %> <% } %>
$("#submit").click(() => { $("#submit").click(() => {
const priceFilterValues = priceSlider.noUiSlider.get(); const priceFilterValues = priceSlider.noUiSlider.get();
$("#priceFilterMin").val(priceFormat.from(priceFilterValues[0])); $("#priceFilterMin").val(priceFilterValues[0]);
$("#priceFilterMax").val(priceFormat.from(priceFilterValues[1])); $("#priceFilterMax").val(priceFilterValues[1]);
const sizeFilterValues = sizeSlider.noUiSlider.get(); const sizeFilterValues = sizeSlider.noUiSlider.get();
$("#sizeFilterMin").val(sizeFormat.from(sizeFilterValues[0])); $("#sizeFilterMin").val(sizeFilterValues[0]);
$("#sizeFilterMax").val(sizeFormat.from(sizeFilterValues[1])); $("#sizeFilterMax").val(sizeFilterValues[1]);
<% if (hasGardenSize) { %> <% if (hasGardenSize) { %>
const gardenSizeFilterValues = gardenSizeSlider.noUiSlider.get(); const gardenSizeFilterValues = gardenSizeSlider.noUiSlider.get();
$("#gardenSizeFilterMin").val(sizeFormat.from(gardenSizeFilterValues[0])); $("#gardenSizeFilterMin").val(gardenSizeFilterValues[0]);
$("#gardenSizeFilterMax").val(sizeFormat.from(gardenSizeFilterValues[1])); $("#gardenSizeFilterMax").val(gardenSizeFilterValues[1]);
<% } %> <% } %>
$("#filtersForm").submit(); $("#filtersForm").submit();
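
The slider handlers in this hunk do two things repeatedly: they round a raw slider value to the nearest multiple of the configured step, and they keep a typed-in minimum from exceeding the current maximum. A minimal sketch of both operations as pure functions follows. `snapToStep` and `clampMin` are hypothetical names that do not appear in the commit. The sketch assumes that values read from inputs or from the slider may arrive as strings, which is why it converts them with `Number` before doing arithmetic or comparisons.

```javascript
// Hypothetical helpers, not part of the commit.

// Round a raw slider value to the nearest multiple of `step`.
function snapToStep(value, step) {
    return Math.round(Number(value) / step) * step;
}

// Keep a typed-in minimum from exceeding the current maximum.
// Both arguments may arrive as strings, so compare them as numbers.
function clampMin(newMin, currentMax) {
    return Math.min(Number(newMin), Number(currentMax));
}
```

Comparing the raw strings instead would order `"9"` after `"100.00"`, so the numeric conversion matters for the clamp.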


@@ -1,25 +1,84 @@
<br><br>
<form method="POST" id="form-real-estate-type"> <form method="POST" id="form-real-estate-type">
<div class="row center-align"> <div class="center-align">
<div class="collection">
<% for(const realEstateType of realEstateTypes) { %>
<a href="#" class="waves-effect collection-item" <div class="row">
style="color: #02adba" <select class="segment-select" id="adType" name="adType">
id="<%= realEstateType.id %>" <option value="<%= AD_TYPE.AD_TYPE_SALE.id %>"
onclick="saveAndSubmit(this.id)" <% if (selectedAdType === AD_TYPE.AD_TYPE_SALE.id) { %>
> selected="selected"
<%= realEstateType.title %> <% } %>
</a> ><%= AD_TYPE.AD_TYPE_SALE.title %></option>
<option value="<%= AD_TYPE.AD_TYPE_RENT.id %>"
<% if (selectedAdType === AD_TYPE.AD_TYPE_RENT.id) { %>
selected="selected"
<% } %>
><%= AD_TYPE.AD_TYPE_RENT.title %></option>
</select>
</div>
<% } %> <br>
</div> <div id="realEstateTypeSelection" class="collection">
<input type="hidden" name="realEstateType" id="realEstateType" /> <% for(const realEstateType of realEstateTypes) { %>
</div>
<a class="waves-effect row collection-item"
id="<%= realEstateType.id %>"
href="#"
style="color: #02adba"
onclick="saveAndSubmit(this.id)"
>
<span class="center-align"><%= realEstateType.title %></span>
</a>
<% } %>
</div>
<input type="hidden" name="realEstateType" id="realEstateType" />
</div>
</form> </form>
<script> <script>
(function($) {
$.fn.extend({
Segment: function() {
$(this).each(function() {
const self = $(this);
const onchange = self.attr('onchange');
const wrapper = $("<div>", { class: "ui-segment" });
$(this)
.find("option")
.each(function() {
const option = $("<span>", {
class: "option",
onclick: onchange,
text: $(this).text(),
value: $(this).val(),
});
if ($(this).is(":selected")) {
option.addClass("active");
}
wrapper.append(option);
});
wrapper.find("span.option").click(function (){
wrapper.find("span.option").removeClass("active");
$(this).addClass("active");
self.val($(this).attr('value'));
});
$(this).after(wrapper);
$(this).hide();
});
}
});
})(jQuery);
$(document).ready(() => {
$(".segment-select").Segment();
});
function saveAndSubmit(id) { function saveAndSubmit(id) {
$("#realEstateType").val(id); $("#realEstateType").val(id);
$("#realEstateTypeSelection > a").attr("onclick", "");
$("#form-real-estate-type").submit(); $("#form-real-estate-type").submit();
} }
</script> </script>


@@ -23,6 +23,7 @@ SOURCE_EMAIL=info@saburly.com
#=============== CRAWLER SETTINGS===============# #=============== CRAWLER SETTINGS===============#
CRAWLER_INTERVAL=Interval to run crawler(s), in seconds CRAWLER_INTERVAL=Interval to run crawler(s), in seconds
STOP_CRAWLER=Non-zero value will skip crawler execution STOP_CRAWLER=Non-zero value will skip crawler execution
PRINT_CRAWLER_DEBUG_INFO=Non-zero value will print crawler debugging info to the server console
#==OLX== #==OLX==
OLX_MAX_PAGES=Restrict crawler to this number of pages OLX_MAX_PAGES=Restrict crawler to this number of pages
OLX_MAX_RESULTS_PER_PAGE=Only this number or fewer results from one page will be scraped and saved OLX_MAX_RESULTS_PER_PAGE=Only this number or fewer results from one page will be scraped and saved
@@ -30,3 +31,24 @@ OLX_CRAWLER_AD_TYPE=enum name of what type of ads should be crawled, check commo
OLX_CRAWLER_AD_CATEGORIES=comma separated list of enum names of categories to be included, check common/enums.js file for valid values OLX_CRAWLER_AD_CATEGORIES=comma separated list of enum names of categories to be included, check common/enums.js file for valid values
OLX_IGNORED_USERNAMES=comma separated list of usernames to ignore OLX_IGNORED_USERNAMES=comma separated list of usernames to ignore
OLX_DELAY_BETWEEN_PAGES=time in milliseconds to wait before indexing next page OLX_DELAY_BETWEEN_PAGES=time in milliseconds to wait before indexing next page
#==RENTAL==
RENTAL_MAX_PAGES=Restrict crawler to this number of pages
RENTAL_MAX_RESULTS_PER_PAGE=Only this number or fewer results from one page will be scraped and saved
RENTAL_CRAWLER_AD_TYPE=enum name of what type of ads should be crawled, check common/enums.js file for valid values
RENTAL_CRAWLER_AD_CATEGORIES=comma separated list of enum names of categories to be included, check common/enums.js file for valid values
RENTAL_IGNORED_USERNAMES=!!! This is not used for rental crawler !!!
RENTAL_DELAY_BETWEEN_PAGES=time in milliseconds to wait before indexing next page
#==PROSTOR==
PROSTOR_MAX_PAGES=!!! This is not used for prostor crawler !!!
PROSTOR_MAX_RESULTS_PER_PAGE=For Prostor crawler, this represents MAX RESULTS in total
PROSTOR_CRAWLER_AD_TYPE=enum name of what type of ads should be crawled, check common/enums.js file for valid values
PROSTOR_CRAWLER_AD_CATEGORIES=comma separated list of enum names of categories to be included, check common/enums.js file for valid values
PROSTOR_IGNORED_USERNAMES=!!! This is not used for prostor crawler !!!
PROSTOR_DELAY_BETWEEN_PAGES=!!! This is not used for prostor crawler !!!
#==AKTIDO==
AKTIDO_MAX_PAGES=Restrict crawler to this number of pages
AKTIDO_MAX_RESULTS_PER_PAGE=Only this number or fewer results from one page will be scraped and saved
AKTIDO_CRAWLER_AD_TYPE=enum name of what type of ads should be crawled, check common/enums.js file for valid values
AKTIDO_CRAWLER_AD_CATEGORIES=comma separated list of enum names of categories to be included, check common/enums.js file for valid values
AKTIDO_IGNORED_USERNAMES=!!! This is not used for aktido crawler !!!
AKTIDO_DELAY_BETWEEN_PAGES=time in milliseconds to wait before indexing next page
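
The settings above describe two recurring value shapes: flags where a "non-zero value" switches something on, and comma separated lists of enum names or usernames. The crawler code that reads them is not shown in this diff, so the following is only a sketch of one way such values could be parsed. `parseFlag` and `parseList` are hypothetical names, and treating non-numeric text such as `yes` as "off" is an assumption.

```javascript
// Hypothetical helpers, not taken from the repository.

// "Non-zero value" flags: unset, empty, non-numeric, or "0" count as off.
function parseFlag(raw) {
    const n = Number(raw);
    return !Number.isNaN(n) && n !== 0;
}

// Comma separated lists: trim each entry and drop empty ones.
function parseList(raw) {
    return (raw || "")
        .split(",")
        .map((item) => item.trim())
        .filter((item) => item.length > 0);
}
```

Under these assumptions, `parseFlag(process.env.STOP_CRAWLER)` would decide whether to skip a run, and `parseList(process.env.OLX_IGNORED_USERNAMES)` would yield an array that is empty when the variable is unset.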


@@ -1,6 +1,6 @@
"use strict"; "use strict";
const olxCrawler = require("../app/crawler/specific/olx"); const olxCrawler = require("../app/crawler/specificCrawlers/olx");
const urlToScrape = process.argv[2] || undefined; const urlToScrape = process.argv[2] || undefined;