Biblical data including translations, tagged original-language texts, Second Temple literature, early church writings, dictionaries, and cross-references.
Legend:
- 🏷️ Morphologically Tagged
- 🌲 Syntax Trees
- 💬 Discourse Analysis
- Aligned Bible Texts - Alignments produced automatically and/or corrected manually.
- Bible corpus - A multilingual parallel corpus created from translations of the Bible.
- Gratis Bible (OSIS XML)
- Open English Bible - A CC0 Bible translation.
- Parallel corpora from eBible.org (verse-per-line txt; source) - Made for use with NLP; not ideal for finding book/chapter/verse divisions.
- Unfolding Word Translations - See esp. their Literal Translation, Simplified Translation. Resources developed for Bible translators.
- Zefania Bibles - A corpus of 140+ Bibles in 63 languages (and some English/German resources such as concordances). The Bibles are formatted in "Zefania XML". Some include Strong's tagging.
OT
- ETCBC BHSa (TextFabric) 🌲 🏷️
- ETCBC BHSa - Hierarchical XML format 🌲 🏷️ - For those who prefer XML to TextFabric, in canonical and reordered versions
- Macula Hebrew 🌲 - One of the most developed datasets. Combines multiple sources, with clear provenance!
- MorphHB 🏷️ - Crowdsourced tagging of the OT
- Peshitta (TextFabric)
- Speaker Quotations for the whole Bible in various translations and the original languages. 💬
- STEPBible Data 🏷️ - One of the most developed datasets
LXX
- CCAT LXX in sqlite 🏷️
- LXX Codex Alexandrinus
- STEPBible Data 🏷️ - Appears to only be available upon request
- Swete's LXX Text from 1KY corrected 🏷️
NT
- Byzantine Majority Text 🏷️
- SBLGNT - Source data for the SBL GNT published by Logos.
- SBLGNT Tagged by MorphGNT 🏷️
- Levinsohn's Greek New Testament Discourse Features 💬
- Macula Greek 🌲 🏷️ - One of the most developed datasets. Combines multiple sources, with clear provenance!
- NA1904 Tagged by MorphGNT 🏷️
- OpenText Context Annotations 💬 🌲 - Extracted from the forthcoming OpenText 2.0 syntax data. These include pericopes, speaker turns (which map to the Speaker Quotations dataset above), moves within turns, and tokens/expressions for mapping to other datasets.
- PROIEL Treebanks (GNT, Vulgate, other NTs + more) 🌲
- SBLGNT and Nestle1904 with syntax trees by the Global Bible Initiative 🌲 🏷️
- Statistical Restoration GNT 🏷️ - An approach to construct a critical NT based on the earliest evidence (was Bunning Heuristic Prototype GNT)
- STEPBible Data 🏷️ - One of the most developed datasets!
- Syriac New Testament (TextFabric)
- Online-Critical-Pseudepigrapha - multiple versions of the Pseudepigrapha, powering http://pseudepigrapha.com
- (English) https://github.com/scrollmapper/bible_databases_deuterocanonical
- (English/Hebrew) https://github.com/Sefaria/Sefaria-Export
- https://github.com/ETCBC/dss (TextFabric) 🏷️
- https://github.com/ETCBC/extrabiblical
- https://github.com/Sefaria/Sefaria-Export
- (English) Ante- and Post-Nicene Fathers (TEI XML)
- (Greek) Apostolic Fathers, hand-corrected.
- (Greek) Clement of Alexandria
- (Greek) Justin Martyr
- (Greek) Patristics (TextFabric)
- CCEL Reference Mappings - Converted to sqlite. Note, the original mappings are no longer at the CCEL URL.
- Copenhagen Alliance Versification Mappings
- STEPBible TVTMS - Translators Versification Traditions with Methodology for Standardisation
- Abbott-Smith (NT)
- BDB (OT)
- Jeffrey Dodson's Greek Lexicon (NT)
- Koine Greek English Dictionary - CC0. An updated Strong's (likely NT Greek only).
- UBS Dictionary of Biblical Hebrew & Greek CC-BY-SA. Extracted from the SDBH and SDGNT.
- Strong's (OT + NT)
- Koine Greek to Chinese Dictionary - CC0. Apparently a conversion of the "Koine Greek English Dictionary" to Chinese. Because it is Strong's-based, it likely covers only NT Greek. "This dictionary contains Chinese glosses where a gloss is available from the biblical text database."