forked from s-nt-s/LFS201

Save Linux Foundation online training courses for offline use


kanner/lfs-crawler

 
 


lfs-crawler saves LF courses for offline use

Linux Foundation online training courses are designed to be accessed entirely online via the 360training.com platform. This project automates extraction of the course text for offline use and its conversion to a cleaner, more manageable HTML format.

Acknowledgment:

This is a fork of the https://github.com/s-nt-s/LFS201 repository (originally in Spanish), adapted for the English versions of several LF online training courses (currently: LFS201, LFS211, LFS216, LFS265, LFC191). See the previous credits in the original repository (or here, in LFCS/APUNTES.md, for example). I have only adapted the project to the English version (partially) and added support for more courses.

WARNING: All rights to the course materials belong to the Linux Foundation. I will not publish these materials publicly. Use this project only if you have obtained access to the supported courses from the Linux Foundation and want to save important information for later preparation for LF certification (LFCS, LFCE, etc.; see https://training.linuxfoundation.org/certification).

Prerequisites:

  • First, you must have access to one of the supported LF online training courses; otherwise, don't use this project.

  • To use the project, install the Chrome or Chromium web browser.

  • Check out the branch that corresponds to your LF course.

  • Set the following preferences in the Chrome/Chromium browser and in the project files:

    • disable the option "Ask where to save each file before downloading" in the browser settings;

    • set "Download location" in the browser settings and set the CHROME_DOWNLOAD_PATH variable in the run.sh script to the same value;

    • use "Load unpacked extension" in the Chrome/Chromium Extensions tab to add the chrome folder of this project;

    • when this extension (named Autosave) later asks for permission to save multiple files, allow it (whether the request appears depends on the Chrome/Chromium version).

How to use:

  1. First, with the Autosave Chrome/Chromium extension enabled, start your LF online training course:

     • In the course window, the Autosave extension should start navigating the slides one by one, saving the rendered content of each slide (with some exceptions) to the Chrome/Chromium download location;

     • You can stop it (by closing the course window or disabling the extension) or wait until the course is over and every slide has been saved to the Chrome/Chromium download location.

  2. Second, execute the run.sh script from the project folder. It moves all saved content to the working directory (lfs-crawler/html/{orig,clean}), cleans the content by simplifying its format with clean.py, and uses join.py to bind the modified content together, index it with headers, and write it to the output directory (lfs-crawler/out/).

  3. Now you can open the resulting HTML file from lfs-crawler/out/ and see all the text materials there.
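
The first stage of step 2 (moving the saved slides into the working directory) could look roughly like the Python sketch below. It is only an illustration under stated assumptions, not the actual contents of run.sh: the function name collect_saved_slides, the "*.html" file pattern, and the idea that slides are saved as plain .html files are all hypothetical.

```python
import shutil
from pathlib import Path


def collect_saved_slides(download_dir, work_dir, pattern="*.html"):
    """Move saved slide files from the browser download directory
    into <work_dir>/orig and return the new paths.

    Hypothetical helper: the "*.html" pattern is an assumption about
    how the saved slides are named, not something the project documents.
    """
    orig_dir = Path(work_dir) / "orig"
    clean_dir = Path(work_dir) / "clean"
    # Create both working subdirectories mentioned in the README.
    orig_dir.mkdir(parents=True, exist_ok=True)
    clean_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    # Sort so that slides are processed in a stable, name-based order.
    for src in sorted(Path(download_dir).glob(pattern)):
        if src.is_file():
            dest = orig_dir / src.name
            shutil.move(str(src), str(dest))
            moved.append(dest)
    return moved
```

The download directory passed in would be the same value that CHROME_DOWNLOAD_PATH holds in run.sh; files that do not match the pattern are left where they are.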

Limitations:

  • For each supported LF training course there is a separate git branch (yeah, I was too lazy to merge support for all the courses into the master branch). The lfs201 branch is the same as the master branch; the other branches usually differ only in the last one or two commits (different parsing methods are used).

  • By default, auto-navigation is on, but popups are not visited (you have to open them manually for them to be saved). You can turn auto-navigation off by commenting out the line setTimeout(this.nav,5000); in chrome/content/js.js and reloading the extension ("Reload" link in the Extensions tab).

  • Some scripts from the original Spanish version may be broken: obj.py, fix/mdtohtml.py, LFCS/videos.sh (obsolete links). I didn't try to port all of the project's features (pandoc, markdown or epub creation; error corrections in join.py specific to the Spanish version; conversion of the original author's notes; and so on).

  • The other scripts are usable or partially usable: epub.py, tecmint.py, labs.sh, run.sh (executes clean.py and join.py).

  • The current version doesn't save images, audio, or video materials; only text is saved (plus images from popups if you execute img-extractor.sh in lfs-crawler/out/ after run.sh).

  • The current version has problems with hyperlinks (they're implemented as JavaScript functions with names hyperlink***, not as <a href=...>...</a>).
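
To see how many slides are affected by the hyperlink limitation, one option is to scan the saved HTML for calls to functions whose names start with hyperlink. The sketch below is a minimal illustration, not part of the project: the helper name find_hyperlink_calls is hypothetical, and it assumes the calls appear in the markup as a name followed by an opening parenthesis (for instance inside an onclick attribute), which the README does not confirm.

```python
import re

# Assumption: a hyperlink function shows up in the saved markup as
# "hyperlink<suffix>(", e.g. inside an onclick="..." attribute.
HYPERLINK_CALL = re.compile(r"\b(hyperlink\w*)\s*\(")


def find_hyperlink_calls(html_text):
    """Return the distinct hyperlink* function names referenced in
    html_text, in order of first appearance."""
    seen = []
    for name in HYPERLINK_CALL.findall(html_text):
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    # Made-up markup, used only to exercise the helper.
    sample = '<span onclick="hyperlink12()">docs</span> <a href="x">ok</a>'
    print(find_hyperlink_calls(sample))
```

This only locates the calls; recovering the target URLs would require inspecting the bodies of those JavaScript functions, whose format is not described here.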
