A sample ETL project using kiba-extend

lyrasis/kiba-extend-project

kiba-extend project sample

A sample base project using kiba-extend.

Includes heavily commented/explained code for:

Prerequisites

You must have a modern Ruby installed. This should work with 3.1.0 and up.

💡
It is highly recommended you use a version manager. The author is using rbenv.

Try it out

Set it up

Clone this repository.

cd into the top level of your repository.

Run bundle install.

OPTIONAL: Add kiba-extend-project/bin to your PATH so you can run commands without needing to type bundle exec each time.

Try some commands

Now you should be able to run the thor tasks for the project. In your terminal, get a list of the project’s defined files/jobs:

thor reg:list

Jobs tagged with place and cleanup are dynamically set up for you via the IterativeCleanup mixin added in kiba-extend v4.0.0:

thor jobs tagged_and --tags=place cleanup

Run the locations__to_json job (which has the effect of running all other jobs as its dependencies):

thor run:job locations__to_json
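To illustrate why running one job can trigger the others, here is a minimal sketch of depth-first dependency resolution in plain Ruby. It is not kiba-extend's implementation. The dependency graph and every job key other than locations__to_json are hypothetical, invented only for the example.

```ruby
# Hypothetical dependency graph: each job key maps to the jobs it needs first.
# Only locations__to_json is a job name taken from this project; the other
# keys are invented for illustration.
DEPENDENCIES = {
  "locations__to_json" => ["locations__cleaned"],
  "locations__cleaned" => ["locations__prepped"],
  "locations__prepped" => []
}.freeze

# Runs the dependencies of a job first (depth first), then the job itself.
# Returns the job keys in the order they were run; each job runs only once.
def run_job(key, deps = DEPENDENCIES, done = [])
  return done if done.include?(key)

  deps.fetch(key).each { |dep| run_job(dep, deps, done) }
  done << key
end

p run_job("locations__to_json")
```

In this sketch, asking for the final job pulls in every job upstream of it, which is the effect described above for locations__to_json.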

More about the thor CLI here.

Run the tests

rspec

or

bundle exec rake spec

Explore the code and the data

The code is heavily commented to explain what things are doing.

To understand the relationship between lib/ke_project/registry_data.rb and setting up jobs, I recommend you start by looking at that file and lib/ke_project/everything_exploded.rb, which has more comments explaining how this works. Then look at the files in lib/ke_project/locations for the current recommended approach.

The source files are in the specified locations, so jobs should run. You must run the jobs to see the results.

Start your own project

Option 1: Use this repo as a template

Create a repository for your project using this repository as a template.

In your new repository:

  • global find/replace KeProject and ke_project with your project name

  • set your configurations as necessary

  • delete comments and stuff you aren’t using

ℹ️
Ruby conventions for naming

Camel case is for names of modules and classes in your code.

When naming files that correspond to said modules or classes, you downcase and separate with underscore instead of camel casing.
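The find/replace and the naming convention can be sketched together in a few lines of Ruby. This is only an illustration under stated assumptions: the project name MyMigration and the helper names underscore and rename_project are hypothetical, and a real rename would be applied across every file in the repository (including file and directory names), which this sketch does not do.

```ruby
# Converts a camel-case constant name to its underscored file-name form,
# for example "KeProject" becomes "ke_project".
def underscore(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Replaces both spellings of the template's project name in a string.
# new_name is the camel-case name of your project (hypothetical here).
def rename_project(text, new_name)
  text.gsub("KeProject", new_name).gsub("ke_project", underscore(new_name))
end

line = "module KeProject; end # defined in lib/ke_project.rb"
puts rename_project(line, "MyMigration")
```

The camel-case form is used wherever the code names the module, and the underscored form wherever it names a file or directory.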

Run bundle install in your repo directory.

Configure your project settings in your lib/ke_project.rb equivalent.

Start setting up and running jobs!

Option 2: Build your project manually following patterns in this repo

Pro

less tedious find/replace and deleting stuff

Con

easier to leave a necessary piece out or introduce typos/other errors

Notes on code organization

With the exception of registry_data.rb, the structure of directories and files in lib/ke_project is fairly arbitrary: how you name and nest modules and submodules, and which methods you define in which modules, is up to you. For example, you could define all your jobs in one big module file if you wanted.

However, the author has thus far found it useful to set up the structure as shown in the lib/ke_project/jobs/locations directory.

Using Zeitwerk to handle code loading introduces some constraints on how you can organize your code. As long as you follow common Ruby conventions of defining one module or class per file, and naming the file after the module or class it defines, you should be good. See Zeitwerk’s file structure documentation for more details.
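As a rough illustration of that convention, the sketch below maps a file path (relative to lib/) to the constant an autoloader would expect the file to define. It is a simplified stand-in written in plain Ruby, not Zeitwerk's own code, and the helper names camelize and expected_constant are hypothetical. That the two files shown define exactly these constants is an assumption based on the convention.

```ruby
# Simplified stand-in for default camel-case inflection of one path segment;
# Zeitwerk's real inflector is configurable and handles more cases.
def camelize(segment)
  segment.split("_").map(&:capitalize).join
end

# Returns the name of the constant that a file, given as a path relative to
# lib/, is expected to define. Each directory becomes a namespace.
def expected_constant(relative_path)
  relative_path.delete_suffix(".rb").split("/").map { |seg| camelize(seg) }.join("::")
end

puts expected_constant("ke_project/registry_data.rb")
puts expected_constant("ke_project/everything_exploded.rb")
```

If a file defines a constant other than the one its path implies, autoloading will typically fail with an error, which is the constraint mentioned above.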

More documentation
