Machine Learning

Trained models

Eland can serialize trained models from the scikit-learn, XGBoost, and LightGBM libraries and import them into Elasticsearch, where they can be used as inference models.

>>> from xgboost import XGBClassifier
>>> from eland.ml import MLModel
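
# NOTE: training_data is assumed here to be a (features, labels) pair of arrays,
# for example from sklearn.datasets.make_classification(n_features=5)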

# Train and exercise an XGBoost ML model locally
>>> xgb_model = XGBClassifier(booster="gbtree")
>>> xgb_model.fit(training_data[0], training_data[1])

>>> xgb_model.predict(training_data[0])
[0 1 1 0 1 0 0 0 1 0]

# Import the model into Elasticsearch
>>> es_model = MLModel.import_model(
    es_client="http://localhost:9200",
    model_id="xgb-classifier",
    model=xgb_model,
    feature_names=["f0", "f1", "f2", "f3", "f4"],
)

# Exercise the ML model in Elasticsearch with the training data
>>> es_model.predict(training_data[0])
[0 1 1 0 1 0 0 0 1 0]
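
If you no longer need the model, you can remove it from the cluster again; a minimal sketch, assuming Eland's MLModel.delete_model() helper:

# Delete the trained model from Elasticsearch
>>> es_model.delete_model()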

Natural language processing (NLP) with PyTorch

To import an NLP model, you need the appropriate version of PyTorch. Install it with python -m pip install 'eland[pytorch]'.
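
For example, from a shell:

$ python -m pip install 'eland[pytorch]'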

For NLP tasks, Eland enables you to import PyTorch models into Elasticsearch. Use the eland_import_hub_model script to download and install supported transformer models from the Hugging Face model hub. For example:

$ eland_import_hub_model <authentication> \
  --url http://localhost:9200/ \
  --hub-model-id elastic/distilbert-base-cased-finetuned-conll03-english \
  --task-type ner \
  --start

  • <authentication>: Use an authentication method to access your cluster. Refer to Authentication methods.
  • --url: The cluster URL. Alternatively, use --cloud-id.
  • --hub-model-id: The identifier for the model in the Hugging Face model hub.
  • --task-type: The type of NLP task. Supported values are fill_mask, ner, question_answering, text_classification, text_embedding, and zero_shot_classification.

Import model with Docker

If you want to use Eland without installing it, you can use the prebuilt Docker image. Alternatively, you can build the container yourself by cloning the Eland repository: https://github.com/elastic/eland
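
A minimal build sketch, assuming the Dockerfile at the repository root (the image tag eland below is just a placeholder):

$ git clone https://github.com/elastic/eland
$ cd eland
$ docker build -t eland .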

You can use the container interactively:

$ docker run -it --rm --network host docker.elastic.co/eland/eland

Running installed scripts is also possible without an interactive shell, for example:

docker run -it --rm docker.elastic.co/eland/eland \
    eland_import_hub_model \
      --url $ELASTICSEARCH_URL \
      --hub-model-id elastic/distilbert-base-uncased-finetuned-conll03-english \
      --start

Replace $ELASTICSEARCH_URL with the URL for your Elasticsearch cluster. For authentication purposes, include an administrator username and password in the URL in the following format: https://username:password@host:port.
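
For example, for a hypothetical local cluster secured with the elastic superuser (placeholder password):

$ export ELASTICSEARCH_URL="https://elastic:changeme@localhost:9200"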

Install models in an air-gapped environment

You can install models in a restricted or closed network by pointing the eland_import_hub_model script to local files.

For an offline install of a Hugging Face model, the model first needs to be cloned locally. Git and Git Large File Storage (Git LFS) must be installed on your system.
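
For example, once Git LFS is installed, it can be enabled with its standard setup command (not specific to Eland):

$ git lfs install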

  1. Select a model you want to use from Hugging Face. Refer to the compatible third party model list for more information on the supported architectures.
  2. Clone the selected model from Hugging Face by using the model URL. For example:

    git clone https://huggingface.co/dslim/bert-base-NER

    This command results in a local copy of the model in the directory bert-base-NER.

  3. Use the eland_import_hub_model script with the --hub-model-id set to the directory of the cloned model to install it:

    eland_import_hub_model \
          --url 'XXXX' \
          --hub-model-id /PATH/TO/MODEL \
          --task-type ner \
          --es-username elastic --es-password XXX \
          --es-model-id bert-base-ner

    If you use the Docker image to run eland_import_hub_model you must bind mount the model directory, so the container can read the files:

    docker run --mount type=bind,source=/PATH/TO/MODELS,destination=/models,readonly -it --rm docker.elastic.co/eland/eland \
        eland_import_hub_model \
          --url 'XXXX' \
          --hub-model-id /models/bert-base-NER \
          --task-type ner \
          --es-username elastic --es-password XXX \
          --es-model-id bert-base-ner

    Once it’s uploaded to Elasticsearch, the model will have the ID specified by --es-model-id. If it is not set, the model ID is derived from --hub-model-id; spaces and path delimiters are converted to double underscores __.
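
    For example, under the rule above, importing with --hub-model-id elastic/distilbert-base-cased-finetuned-conll03-english and no --es-model-id results in the model ID elastic__distilbert-base-cased-finetuned-conll03-english.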

Authentication methods

The following authentication options are available when using the import script:

  • Elasticsearch username and password authentication (specified with the -u and -p options):

    eland_import_hub_model -u <username> -p <password> --cloud-id <cloud-id> ...

    These -u and -p options also work when you use --url.

  • Elasticsearch username and password authentication (embedded in the URL):

    eland_import_hub_model --url https://<user>:<password>@<hostname>:<port> ...
  • Elasticsearch API key authentication:

    eland_import_hub_model --es-api-key <api-key> --url https://<hostname>:<port> ...
  • HuggingFace Hub access token (for private models):

    eland_import_hub_model --hub-access-token <access-token> ...