Generate visual podcasts about novels using open source models

jquesnelle/literAI

literAI

Demo: https://literai.hooloovoo.ai (source)

literAI is an experiment in open source AI composition written by emozilla. Originally inspired by scribepod by yacine, it creates a podcast in which two hosts, Alice and Bob, analyze a novel they have both purportedly read recently, accompanied by images generated from inferred descriptions of scenes in the novel. Crucially, literAI uses exclusively open source AI models (no API calls) and is designed to run on (admittedly high-end) consumer-grade hardware. It requires 24 GB of VRAM, although it could likely be tweaked to work with less.
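
Because the 24 GB VRAM requirement is the main hardware constraint, it can be worth checking the installed GPU before starting a long run. The following is a minimal sketch and not part of literAI: it assumes an NVIDIA card with the `nvidia-smi` tool on the PATH, and the helper names (`parse_vram_mib`, `query_vram_mib`, `has_enough_vram`) and the 24,000 MiB threshold are illustrative choices, since cards sold as 24 GB may report slightly less than 24,576 MiB.

```python
import shutil
import subprocess

# Assumed threshold: slightly below 24 * 1024 MiB, because some cards
# marketed as 24 GB report a little less than that to nvidia-smi.
REQUIRED_MIB = 24_000


def parse_vram_mib(smi_output: str) -> list[int]:
    """Parse one integer MiB value per non-empty line of nvidia-smi output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def query_vram_mib() -> list[int]:
    """Return total memory per GPU in MiB, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_vram_mib(result.stdout) if result.returncode == 0 else []


def has_enough_vram(per_gpu_mib: list[int], required_mib: int = REQUIRED_MIB) -> bool:
    """True if at least one GPU meets the (assumed) memory threshold."""
    return any(mib >= required_mib for mib in per_gpu_mib)


if __name__ == "__main__":
    gpus = query_vram_mib()
    print("GPU memory (MiB):", gpus)
    print("Meets 24 GB requirement:", has_enough_vram(gpus))
```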

Models used

Model                                           Purpose
pszemraj/long-t5-tglobal-xl-16384-book-summary  Generate summaries of the novel text
allenai/cosmo-xl                                Conversation generation
google/flan-t5-xl                               Scene description summarization from novel passages
dreamlike-art/dreamlike-diffusion-1.0           Image generation

Packages/tools used

Package       Purpose
transformers  Run LLMs
diffusers     Run diffusion models
textsum       Automate summary batching
LangChain     LLM context and prompt construction
TorToiSe      Audio generation
pydub         Audio stitching

Running

To run, clone the repository and install the necessary requirements.

git clone https://github.com/jquesnelle/literAI
cd literAI
python -m pip install -r ./requirements.txt

Then, pass the novel's title, author, and path to the raw UTF-8 encoded text file to the literai module.

python -m literai "Alice's Adventures in Wonderland" "Lewis Carroll" alice-in-wonderland.txt

Note: this may take a while. A 24 GB CUDA-capable video card is highly recommended. The generated data will be in the output/ folder.
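
Since the input must be a raw UTF-8 text file and a full run takes a long time, a quick pre-flight check can catch an encoding problem before any model is loaded. This is a hedged sketch rather than anything literAI provides: the helper names (`decodes_as_utf8`, `is_utf8_file`, `build_run_command`) are hypothetical, and the only thing taken from the instructions above is the `python -m literai <title> <author> <path>` argument order.

```python
import subprocess
import sys
from pathlib import Path


def decodes_as_utf8(data: bytes) -> bool:
    """True if the byte string is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def is_utf8_file(path: Path) -> bool:
    """True if the file exists, is readable, and is valid UTF-8."""
    try:
        return decodes_as_utf8(path.read_bytes())
    except OSError:
        return False


def build_run_command(title: str, author: str, text_path: str) -> list[str]:
    """Build the documented command: python -m literai TITLE AUTHOR PATH."""
    return [sys.executable, "-m", "literai", title, author, str(text_path)]


if __name__ == "__main__":
    novel = Path("alice-in-wonderland.txt")
    if not is_utf8_file(novel):
        sys.exit(f"{novel} is missing or not valid UTF-8")
    cmd = build_run_command("Alice's Adventures in Wonderland", "Lewis Carroll", str(novel))
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with titles that contain apostrophes, such as the one in the example.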

Running incrementally

Generating a literAI podcast is done in six steps, which the main literai command combines together. The steps are:

  1. Generate summaries
  2. Generate dialogue script
  3. Generate image descriptions
  4. Generate images
  5. Generate audio
  6. (optional) Add to index file and upload to Google Cloud Storage

Each of these steps can be invoked separately. For example, to re-create the dialogue script (the result differs on each run):

python -m literai.steps.step2 "Alice's Adventures in Wonderland" "Lewis Carroll"
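
The per-step invocation can also be scripted. The sketch below builds the command line for a given step number. Only `literai.steps.step2` with a title and author is shown in this README, so the module names for the other steps and their arguments are assumptions made by analogy (step 1, for instance, presumably also needs the path to the text file), and `build_step_command` is a hypothetical helper.

```python
import subprocess
import sys

# Step numbers and descriptions as listed above.
STEPS = {
    1: "Generate summaries",
    2: "Generate dialogue script",
    3: "Generate image descriptions",
    4: "Generate images",
    5: "Generate audio",
    6: "Add to index file and upload to Google Cloud Storage (optional)",
}


def build_step_command(step: int, title: str, author: str) -> list[str]:
    """Build `python -m literai.steps.stepN TITLE AUTHOR`.

    Assumption: every step follows the naming and argument pattern
    documented for step 2; verify against the repository before use.
    """
    if step not in STEPS:
        raise ValueError(f"unknown step {step}; expected one of {sorted(STEPS)}")
    return [sys.executable, "-m", f"literai.steps.step{step}", title, author]


if __name__ == "__main__":
    # Re-run only the dialogue script generation (the documented example).
    cmd = build_step_command(2, "Alice's Adventures in Wonderland", "Lewis Carroll")
    print("Running:", STEPS[2])
    subprocess.run(cmd, check=True)
```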
