ℹ️ NOTE: The Orcinus project is now in "release candidate" status. Please use it on your websites and send any feedback my way, or submit issue reports here on GitHub. Thanks so much for your interest!
The Orcinus Site Search PHP script is an all-in-one website crawler, indexer and search engine that extracts searchable content via HTTP/HTTPS from plain text, XML, HTML and PDF files on one or more websites. It replaces third-party, remote search solutions such as Google.
Orcinus will crawl your website content on a schedule, or at your command via the admin UI or even by CLI / crontab. Crawler log output conveniently informs you of missing pages, broken links or links that redirect, and other errors that a webmaster can fix to keep the user experience tight. A full-featured, Bootstrap-based responsive administration GUI allows you to adjust crawl settings, view and edit all crawled pages, customize search results, and view a log of user search queries. You also have complete control over the appearance of your search results with a convenient templating system.
Optionally, Orcinus can generate a sitemap .xml or .xml.gz file of your pages after every crawl, suitable for uploading to the Google Search Console. It can also export a JavaScript version of the entire search engine that works with offline mirrors, such as those generated by HTTrack.
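To give a rough idea of what a sitemap `.xml` / `.xml.gz` pair contains, here is a minimal sketch that uses PHP's bundled `XMLWriter` and `gzencode()`. It is not Orcinus's own code: the `buildSitemap()` function, the `$pages` array shape and the output paths are hypothetical names chosen for illustration, and the sketch assumes the `xmlwriter` and `zlib` extensions are enabled.

```php
<?php
declare(strict_types=1);

/**
 * Build a sitemap XML string from a list of pages (hypothetical helper).
 * Each entry is ['loc' => string, 'lastmod' => ?string].
 */
function buildSitemap(array $pages): string
{
    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->setIndent(true);
    $xml->startDocument('1.0', 'UTF-8');

    // Root element with the namespace defined by the sitemaps.org protocol
    $xml->startElement('urlset');
    $xml->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');

    foreach ($pages as $page) {
        $xml->startElement('url');
        // writeElement() escapes characters such as & for us
        $xml->writeElement('loc', $page['loc']);
        if (!empty($page['lastmod'])) {
            $xml->writeElement('lastmod', $page['lastmod']);
        }
        $xml->endElement(); // </url>
    }

    $xml->endElement(); // </urlset>
    $xml->endDocument();

    return $xml->outputMemory();
}

// Example data; in practice this would come from the crawled page list
$pages = [
    ['loc' => 'https://example.com/', 'lastmod' => '2024-01-15'],
    ['loc' => 'https://example.com/search?q=a&page=2', 'lastmod' => null],
];

$sitemap = buildSitemap($pages);

// Write both the plain and the gzip-compressed variant (paths are assumptions)
$dir = sys_get_temp_dir();
file_put_contents($dir . '/sitemap.xml', $sitemap);
file_put_contents($dir . '/sitemap.xml.gz', gzencode($sitemap, 9));
```

Either file can then be submitted to a search console; the compressed variant simply saves bandwidth on large sites.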
- PHP >= 8.1
- MySQL >= 8.0.17 / MariaDB >= 10.0.5
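If you want to confirm these minimums before installing, a small stand-alone check along the following lines may help. This is only a sketch and not part of Orcinus: `meetsMinimum()` and `databaseMeetsMinimum()` are hypothetical helper names, and the database check assumes you pass in the string returned by `SELECT VERSION()` on your server.

```php
<?php
declare(strict_types=1);

/** True if $actual is at least $minimum (hypothetical helper). */
function meetsMinimum(string $actual, string $minimum): bool
{
    return version_compare($actual, $minimum, '>=');
}

/**
 * Check a server version string against the minimums listed above.
 * MariaDB identifies itself with a "-MariaDB" suffix, sometimes behind
 * a legacy "5.5.5-" prefix; anything else is treated as MySQL.
 */
function databaseMeetsMinimum(string $versionString): bool
{
    if (preg_match('/(\d+\.\d+\.\d+)-MariaDB/i', $versionString, $m)) {
        return meetsMinimum($m[1], '10.0.5');
    }
    if (preg_match('/^\d+\.\d+\.\d+/', $versionString, $m)) {
        return meetsMinimum($m[0], '8.0.17');
    }
    return false; // Unrecognised format: fail safe
}

// PHP_VERSION is the version of the running interpreter
$phpOk = meetsMinimum(PHP_VERSION, '8.1');
echo 'PHP ' . PHP_VERSION . ': ' . ($phpOk ? 'OK' : 'too old') . PHP_EOL;

// With a live connection the string could be fetched like this:
//   $version = $pdo->query('SELECT VERSION()')->fetchColumn();
foreach (['8.0.35', '8.0.16', '5.5.5-10.6.12-MariaDB-log'] as $version) {
    $dbOk = databaseMeetsMinimum($version);
    echo $version . ': ' . ($dbOk ? 'OK' : 'too old') . PHP_EOL;
}
```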
Included:
Optional:
- Copy the `orcinus` directory to your root web directory.
- Fill out your SQL and desired credential details in the `orcinus/config.ini.php` file.
- Visit `yourdomain.com/orcinus/admin.php` in your favourite web browser and log in.
- Optionally follow the instructions in `orcinus/geoip2/README.md` to enable geolocation of search queries.
Examples of search interface integration are given in the `example.php` (online / PHP) and `example.html` (offline / JavaScript) files.