meilibridge is a robust package designed to seamlessly sync data from both SQL and NoSQL databases to Meilisearch, providing an efficient and unified search solution.

Ja7ad/meilibridge

meilibridge

Features

  • Supports multiple data sources
  • Compatible with various databases such as MongoDB (currently supported), MySQL, and PostgreSQL
  • Index configuration options
  • Real-time synchronization (not yet supported for MySQL and PostgreSQL)
  • Bulk sync support with options to continue or reindex
  • Concurrent data bridging to Meilisearch
  • Customizable fields for indexing
  • Configurable primary key per index
  • A dedicated Meilisearch instance for each bridge
  • Scheduled automatic bulk sync of existing indexes in the background
  • Trigger sync: real-time sync initiated by webhook calls

Installation

You can install meilibridge using different methods.

Release

Download the latest release from here.

Go Installation

Install the package using Go:

go install github.com/Ja7ad/meilibridge/cmd/meilibridge@latest

Docker

Run meilibridge with real-time sync using Docker:

  • Docker
docker run --rm -it -v ./config.yml:/etc/meilibridge/config.yml --name meilibridge ja7adr/meilibridge
  • Docker compose
version: "3"
services:
  meilibridge:
    image: ja7adr/meilibridge:latest
    volumes:
      - ./config.yml:/etc/meilibridge/config.yml
    restart: always

Example Configuration

An example configuration for running meilibridge:

general:
  # The trigger sync method is a new way to synchronize data by receiving signals from webhooks.
  # It creates a custom route for each bridge to receive the signal and initiate data synchronization.
  # For example: http://127.0.0.1:8800/{bridge_name}
  trigger_sync:
    token: foobar # The token secures your webhook and must be sent in the header with the key "x-token-key".
    listen: 127.0.0.1:8800
  auto_bulk_interval: 1800 # interval in seconds between automatic bulk syncs (with continue) on existing indexes; default is 1800 (30 min)
  pprof:
    enable: false
    listen: 127.0.0.1:9900

bridges:
  - name: bridge1 # name is required

    meilisearch:
      # API address of meilisearch
      api_url: http://127.0.0.1:7700
      # master key https://www.meilisearch.com/docs/learn/security/differences_master_api_keys#master-key
      # optional
      api_key: foobar

    database:
      # database engine: mongo, mysql, or postgres
      engine: mongo
      host: "localhost"
      port: 27017
      user: "foo"
      password: "bar"
      database: "foobar"
      # custom parameter for database engine key:val
      custom_params:
        directConnection: true
        replicaSet: test

    # index_map maps each source collection or table to a Meilisearch index
    # source collection or table -> index
    index_map:
      # to sync a view, use the key format original_table_name:view_name
      col1:col1_view_name:
        index_name: idx1
        # primary key of the Meilisearch index (required).
        # if the key field is renamed under `fields`, use the renamed value here, not the database key.
        # for MongoDB, use the _id field as the primary key.
        # https://www.meilisearch.com/docs/learn/core_concepts/primary_key#primary-field
        primary_key: id
        fields:
          _id: id
          first_name:
          last_name:
          age:
          created_at:

        settings:
          # list of strings Meilisearch should parse as a single term, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#dictionary
          dictionary:
            - foo
            - bar

          # the distinct attribute is a special, user-designated field. It is most commonly used to prevent Meilisearch
          # from returning a set of several similar documents, instead forcing it to return only one, default is empty
          # https://www.meilisearch.com/docs/learn/relevancy/distinct_attribute#setting-a-distinct-attribute-during-configuration
          distinct_attribute: foo

          # fields displayed in the returned documents, default is all attributes
          # https://www.meilisearch.com/docs/reference/api/settings#displayed-attributes
          displayed_attributes:
            - foo
            - bar

          # faceting settings
          # https://www.meilisearch.com/docs/reference/api/settings#faceting-object
          faceting:
            # maximum number of facet values returned for each facet. Values are sorted in ascending lexicographical order
            # default is 100
            max_values_per_facet: 100

          # attributes to use as filters and facets, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#filterable-attributes
          filterable_attributes:
            - first_name
            - last_name

          # fields in which to search for matching query words sorted by order of importance, default is all attributes ["*"]
          # https://www.meilisearch.com/docs/reference/api/settings#searchable-attributes
          searchable_attributes:
            - first_name
            - last_name
            - age

          # attributes to use when sorting search results, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#sortable-attributes
          sortable_attributes:
            - age

          # pagination settings
          # https://www.meilisearch.com/docs/reference/api/settings#pagination
          pagination:
            # the maximum number of search results Meilisearch can return, default is 1000
            # note: setting maxTotalHits to a value higher than the default will negatively impact search performance.
            # setting maxTotalHits to values over 20000 may result in queries taking seconds to complete.
            max_total_hits: 500

          # precision level when calculating the proximity ranking rule, default is "byWord"
          # https://www.meilisearch.com/docs/reference/api/settings#proximity-precision
          proximity_precision: "byWord"

          # list of ranking rules in order of importance,
          # default is ["words", "typo", "proximity", "attribute", "sort", "exactness"]
          # https://www.meilisearch.com/docs/reference/api/settings#ranking-rules
          ranking_rules:
            - "words"
            - "typo"

          # maximum duration of a search query in milliseconds; set 0 for null, default is 1500
          # https://www.meilisearch.com/docs/reference/api/settings#search-cutoff
          search_cutoff_ms: 500

          # list of characters delimiting where one term begins and ends, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#separator-tokens
          separator_tokens:
            - foo
            - bar

          # list of characters not delimiting where one term begins and ends, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#non-separator-tokens
          non_separator_tokens:
            - foo
            - bar

          # list of words ignored by Meilisearch when present in search queries, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#stop-words
          stop_words:
            - foo
            - bar

          # list of associated words treated similarly, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#synonyms
          synonyms:
            wolverine:
              - foo
              - bar
            logan:
              - x
              - y
              - z

          # typo tolerance settings
          # https://www.meilisearch.com/docs/reference/api/settings#typo-tolerance
          typo_tolerance:
            # whether typo tolerance is enabled or not, default is true
            enabled: true

            # minimum word size for accepting 1 typo (one_typo, default is 5) and 2 typos (two_typos, default is 9);
            # two_typos must be between one_typo and 255
            min_word_size_for_typos:
              one_typo: 5
              two_typos: 9

            # an array of words for which the typo tolerance feature is disabled, default is empty
            disable_on_words:
              - foo
              - bar

            # an array of attributes for which the typo tolerance feature is disabled, default is empty
            disable_on_attributes:
              - foo
              - bar

          # embedders translate documents and queries into vector embeddings. You must configure at
          # least one embedder to use AI-powered search, this is experimental.
          # https://www.meilisearch.com/docs/reference/api/settings#embedders-experimental
          embedders:
            embedder1:
              source: source1
              api_key: apikey1
              model: model1
              dimensions: 128
              document_template: template1

            embedder2:
              source: source2
              api_key: apikey2
              model: model2
              dimensions: 128
              document_template: template2

      col2:
        index_name: idx2
        primary_key: id
        fields:
        settings:

  - name: bridge2

    meilisearch:
      api_url: http://127.0.0.1:7700
      api_key: foobar

    database:
      engine: mysql
      host: "localhost"
      port: 6315
      user: "foo"
      password: "bar"
      database: "foobar"

    index_map:
      col1:
        index_name: idx1
        primary_key: id
        fields:
        settings:
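
The index_map conventions above (the original_table_name:view_name key and the source-to-index renaming under fields) can be illustrated with a short Python sketch. It is not meilibridge's implementation: the helper names split_source_key and map_document are hypothetical, and it assumes that an empty fields section keeps every field and that a field listed without a value keeps its source name.

```python
from typing import Any, Dict, Optional, Tuple


def split_source_key(key: str) -> Tuple[str, Optional[str]]:
    """Split an index_map key into (source, view); view is None when absent."""
    source, sep, view = key.partition(":")
    return source, (view if sep else None)


def map_document(
    doc: Dict[str, Any], fields: Optional[Dict[str, Optional[str]]]
) -> Dict[str, Any]:
    """Project a source document onto the configured index fields.

    Assumptions (not confirmed by the README): an empty `fields` section
    keeps every field, and a field with no value keeps its source name.
    """
    if not fields:
        return dict(doc)
    mapped: Dict[str, Any] = {}
    for source_name, index_name in fields.items():
        if source_name in doc:
            # Rename only when the config gives a target name.
            mapped[index_name or source_name] = doc[source_name]
    return mapped


# Mirrors the `fields` section of bridge1 (YAML keys without values load as None).
fields = {"_id": "id", "first_name": None, "last_name": None, "age": None, "created_at": None}
doc = {
    "_id": "65d0982320ff6c9a9a09eca2",
    "first_name": "Ada",
    "last_name": "Lovelace",
    "age": 36,
    "password": "secret",  # not listed in `fields`, so it is dropped
}

print(split_source_key("col1:col1_view_name"))
print(map_document(doc, fields))
```

Under these assumptions the mapped document carries id rather than _id, which is why primary_key is set to id in the example configuration.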

How to run?

$ meilibridge -h
Meilibridge is a robust package designed to seamlessly sync data from both SQL and NoSQL databases to Meilisearch,
providing an efficient and unified search solution.

Usage:
  meilibridge [command]

Available Commands:
  help        Help about any command
  sync        Bulk or real-time sync
  version     Print the version number

Flags:
  -h, --help   Help for meilibridge

Use "meilibridge [command] --help" for more information about a command.

Bulk Sync

Bulk sync recreates the index and syncs all data to Meilisearch.

$ meilibridge sync bulk -h
Start bulk sync operation.

Usage:
  meilibridge sync bulk [flags]

Flags:
  -c, --config string   Path to config file (default "/etc/meilibridge/config.yml")
      --continue        Sync new data on existing index
  -h, --help            Help for bulk

Example:

$ meilibridge sync bulk -c ./config.yml

Bulk Sync with Continue

With --continue, bulk sync keeps the existing index and syncs new data to it instead of recreating the index.

Example:

$ meilibridge sync bulk -c ./config.yml --continue

Auto bulk sync

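Auto bulk sync runs a scheduled bulk sync on existing indexes in the background. The interval comes from general.auto_bulk_interval in the configuration (default 1800 seconds).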

$ meilibridge sync bulk -h
Start bulk sync operation.

Usage:
  meilibridge sync bulk [flags]

Flags:
  -c, --config string   Path to config file (default "/etc/meilibridge/config.yml")
      --continue        Sync new data on existing index
      --auto            Auto bulk sync on exists index every n seconds
  -h, --help            Help for bulk

Example:

$ meilibridge sync bulk -c ./config.yml --auto
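
The three bulk variants above differ only in their flags. As a rough sketch for scripting them, the Python helper below assembles the argument list from the flags shown in the help output. The function name bulk_sync_command is hypothetical, and whether --continue and --auto may be combined is not stated in this README, so the sketch does not enforce either way.

```python
import shlex
from typing import List

# Default taken from the help output above.
DEFAULT_CONFIG = "/etc/meilibridge/config.yml"


def bulk_sync_command(
    config: str = DEFAULT_CONFIG, continue_sync: bool = False, auto: bool = False
) -> List[str]:
    """Build the argv for `meilibridge sync bulk` using only documented flags."""
    cmd = ["meilibridge", "sync", "bulk", "-c", config]
    if continue_sync:
        cmd.append("--continue")
    if auto:
        cmd.append("--auto")
    return cmd


# Print the command line; pass the list to subprocess.run(...) to execute it.
print(shlex.join(bulk_sync_command("./config.yml", continue_sync=True)))
```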

Real-time Sync

meilibridge supports real-time synchronization of database write operations, either by watching the database for changes or by receiving trigger events.

$ meilibridge sync start -h
Start real-time sync operation.

Usage:
  meilibridge sync start [flags]

Flags:
  -c, --config string   Path to config file (default "/etc/meilibridge/config.yml")
  -h, --help            Help for start

Example:

$ meilibridge sync start -c ./config.yml

Trigger Sync

Trigger sync starts synchronization through webhooks. Each index has a unique webhook at the path /{bridge_name}/{index_name}.

$ meilibridge sync trigger -h
start trigger sync

Usage:
  meilibridge sync trigger [flags]

Flags:
  -c, --config string   path to config file (default "/etc/meilibridge/config.yml")
  -h, --help            help for trigger

Example:

$ meilibridge sync trigger -c ./config.yml

Example API call to the webhook:

curl --location 'http://127.0.0.1:8800/bridge/foo' \
--header 'x-token-key: foobar' \
--header 'Content-Type: application/json' \
--data '{
    "index_uid": "foo",
    "type": "UPDATE",
    "document": {
        "primary_key": "_id",
        "primary_value": "65d0982320ff6c9a9a09eca2"
    }
}'
  • index_uid: The index UID in Meilisearch.
  • type: The operation type for the document (INSERT, UPDATE, DELETE).
  • document: The object used to find the document.
  • document.primary_key: The column name or field name that serves as the primary key for the Meilisearch index.
  • document.primary_value: The specific value used to find the document in the database table or collection for synchronization with Meilisearch.
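
The same webhook call can be assembled from application code. The Python sketch below builds the request with the standard library and mirrors the curl example. The function name build_trigger_request is hypothetical, and using the same value for the path's index segment and for index_uid is an assumption taken from that example.

```python
import json
import urllib.request

# Operation types listed in the README.
VALID_TYPES = {"INSERT", "UPDATE", "DELETE"}


def build_trigger_request(
    base_url: str,
    bridge: str,
    index: str,
    token: str,
    op_type: str,
    primary_key: str,
    primary_value: str,
) -> urllib.request.Request:
    """Build (but do not send) a trigger-sync webhook request."""
    if op_type not in VALID_TYPES:
        raise ValueError(f"unsupported operation type: {op_type!r}")
    payload = {
        "index_uid": index,  # assumed equal to the index segment of the path
        "type": op_type,
        "document": {"primary_key": primary_key, "primary_value": primary_value},
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{bridge}/{index}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-token-key": token, "Content-Type": "application/json"},
        method="POST",  # curl --data sends a POST by default
    )


req = build_trigger_request(
    "http://127.0.0.1:8800", "bridge", "foo", "foobar",
    "UPDATE", "_id", "65d0982320ff6c9a9a09eca2",
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the payload easy to inspect or log before it reaches the trigger_sync listener.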
