This repository has been archived by the owner on Oct 17, 2021. It is now read-only.

clj-scraper

A web scraper written for personal enjoyment and as an experiment with core.async. It supports two source websites, selected with --source: ngo and vrotmne.

Requirements

  1. Leiningen
  2. JDK >= 1.6

Building

$ lein uberjar

Usage

$ java -jar target/scraper-0.3.1-standalone.jar

Options

-c, --cache [dir]           cache files directory
-o, --output [dir]          downloaded images directory
-w, --workers [num]         number of download workers
-d, --debug                 display debug info
-s, --source [ngo|vrotmne]  handle of website to scrape
-S, --skip [num]            skip first num posts of LJ
-L, --list-only             save image urls, but don't download
-x, --exit-on-exist         exit the process if downloaded file exists
-h, --help                  print this help

Examples

$ java -jar target/scraper-0.3.1-standalone.jar -w 20 -s ngo
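
A couple more invocations can be sketched by combining only the options listed above. The jar path matches the Usage section; the worker count and the ./images directory are illustrative assumptions, not project defaults. The `run` helper is a hypothetical convenience that is not part of the project: it prints each command by default, so the script is safe to try before the jar is built.

```shell
#!/bin/sh
# Sketch only: the values below are assumptions chosen for illustration.
JAR="${JAR:-target/scraper-0.3.1-standalone.jar}"

# Hypothetical helper: print the command unless DRY_RUN=0 is set,
# in which case execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Save image URLs from the vrotmne source without downloading the images.
run java -jar "$JAR" -s vrotmne -L

# Download from ngo with 8 workers into ./images, and exit as soon as
# a file that was already downloaded is encountered.
run java -jar "$JAR" -s ngo -w 8 -o ./images -x
```

Set DRY_RUN=0 in the environment to execute the commands instead of printing them.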

License

Copyright © 2013 FIXME

Distributed under the Eclipse Public License, the same as Clojure.