GitHub Repository containing code samples for Tiki crawling project

ngtridung97/Tiki-Crawling

Tiki is one of the most popular e-commerce websites in Vietnam. This project crawls as much product information as possible and reshapes it in PostgreSQL. The scripts can be run daily to build up a week of historical pricing data.

Features:

See how it works below

Introduction


Phase 01 - Gather a list of product URLs, apply conditions, and send an email if a condition is met.
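
The Phase 01 flow can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it assumes the condition is a price threshold per product URL, and the names `find_price_drops`, `build_alert`, and the dictionary keys `url`, `name`, and `price` are hypothetical. The email is only built here, not sent.

```python
from email.message import EmailMessage


def find_price_drops(products, thresholds):
    """Return products whose price is at or below the threshold set for their URL.

    Assumption: each product is a dict with "url", "name" and "price" keys,
    and thresholds maps a product URL to the price that triggers an alert.
    """
    hits = []
    for product in products:
        limit = thresholds.get(product["url"])
        if limit is not None and product["price"] <= limit:
            hits.append(product)
    return hits


def build_alert(hits, sender, recipient):
    """Build (but do not send) an email listing the products that met the condition."""
    message = EmailMessage()
    message["Subject"] = f"Tiki price alert: {len(hits)} product(s)"
    message["From"] = sender
    message["To"] = recipient
    lines = [f"{p['name']}: {p['price']} VND ({p['url']})" for p in hits]
    message.set_content("\n".join(lines))
    return message
```

To deliver the message, one option is to pass it to `smtplib.SMTP.send_message` from the standard library, using whatever mail server and credentials are available.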

Phase 02 - Gather the list of 16 main-category URLs and loop until the last page of each URL.
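
A "loop until the last page" crawl might look like the sketch below. It assumes that a page past the end returns no items, which may not match how Tiki actually signals the last page. `crawl_category` and the `fetch_page` callable are hypothetical names; `fetch_page` stands in for whatever function downloads and parses one listing page.

```python
def crawl_category(category_url, fetch_page, max_pages=1000):
    """Collect items from page 1 onward until a page comes back empty.

    fetch_page(category_url, page) is assumed to return a list of items
    for that page, and an empty list once the last page has been passed.
    max_pages is a safety cap so a misbehaving site cannot loop forever.
    """
    items = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(category_url, page)
        if not batch:
            break  # empty page: treat it as the end of the category
        items.extend(batch)
    return items
```

Passing the fetcher in as an argument keeps the paging logic testable without network access.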

Phase 03 - Gather the list of all active category URLs, classify the leaf category URLs, and loop until the last page of each leaf.
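
One way to classify leaf categories is to treat a category as a leaf when no other category names it as a parent. The sketch below assumes each category is a dict with hypothetical `id`, `parent_id`, and `url` keys; the repository may represent the category tree differently.

```python
def find_leaf_categories(categories):
    """Return the categories that are not the parent of any other category.

    Assumption: each category is a dict with an "id" and a "parent_id",
    where parent_id is None for top-level categories.
    """
    parent_ids = {
        category["parent_id"]
        for category in categories
        if category["parent_id"] is not None
    }
    return [category for category in categories if category["id"] not in parent_ids]
```

Crawling only the leaves avoids collecting the same product once per ancestor category.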

Phase 04 - Get Tiki's Category and Product API.

Phase 05 - Scrapy, seller_id and configurable_product (In Progress).

Feedback & Suggestions


Please feel free to fork, comment or give feedback to [email protected]
