clj-scraper

A web scraper written for personal enjoyment and for experimenting with core.async. It supports two websites, selected with the -s option.

Requirements

Leiningen
JDK >= 1.6

Building

$ lein uberjar

Usage

$ java -jar target/scraper-0.3.1-standalone.jar [options]

Options

-c, --cache [dir]           cache files directory
-o, --output [dir]          downloaded images directory
-w, --workers [num]         number of download workers
-d, --debug                 display debug info
-s, --source [ngo|vrotmne]  handle of website to scrape
-S, --skip [num]            skip first num posts of LJ
-L, --list-only             save image urls, but don't download
-x, --exit-on-exist         exit the process if downloaded file exists
-h, --help                  print this help

Examples

$ java -jar target/scraper-0.3.1-standalone.jar -w 20 -s ngo
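
If the jar has not been built yet, the command above fails with a generic Java error. A minimal sketch of a wrapper that checks for the jar first and then forwards its arguments unchanged; the function name run_scraper and the JAR variable are hypothetical and not part of this project:

```shell
#!/bin/sh
# Hypothetical helper, not shipped with clj-scraper.
# JAR defaults to the path produced by `lein uberjar` for version 0.3.1.
JAR="${JAR:-target/scraper-0.3.1-standalone.jar}"

run_scraper() {
    # Refuse to start if the standalone jar is missing.
    if [ ! -f "$JAR" ]; then
        echo "jar not found: $JAR (run 'lein uberjar' first)" >&2
        return 1
    fi
    # Forward every argument to the scraper as given.
    java -jar "$JAR" "$@"
}

# Usage, with flags taken from the Options section:
#   run_scraper -w 20 -s ngo      # 20 download workers, source "ngo"
#   run_scraper -s vrotmne -L     # save image urls, but don't download
```

The wrapper adds no options of its own, so everything listed under Options is passed straight through.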

License

Distributed under the Eclipse Public License, the same as Clojure.
