Machine Learningedit

Trained modelsedit

Eland allows transforming trained models from scikit-learn, XGBoost, and LightGBM libraries to be serialized and used as an inference model in Elasticsearch.

>>> from xgboost import XGBClassifier
>>> from eland.ml import MLModel

# Train and exercise an XGBoost ML model locally
>>> xgb_model = XGBClassifier(booster="gbtree")
>>> xgb_model.fit(training_data[0], training_data[1])

>>> xgb_model.predict(training_data[0])
[0 1 1 0 1 0 0 0 1 0]

# Import the model into Elasticsearch
>>> es_model = MLModel.import_model(
    es_client="http://localhost:9200",
    model_id="xgb-classifier",
    model=xgb_model,
    feature_names=["f0", "f1", "f2", "f3", "f4"],
)

# Exercise the ML model in Elasticsearch with the training data
>>> es_model.predict(training_data[0])
[0 1 1 0 1 0 0 0 1 0]

Natural language processing (NLP) with PyTorchedit

You need to use PyTorch 1.11.0 or earlier to import an NLP model. Run pip install torch==1.11 to install the aproppriate version of PyTorch.

For NLP tasks, Eland enables you to import PyTorch trained BERT models into Elasticsearch. Models can be either plain PyTorch models, or supported transformers models from the Hugging Face model hub. For example:

$ eland_import_hub_model <authentication> \ 
  --url http://localhost:9200/ \ 
  --hub-model-id elastic/distilbert-base-cased-finetuned-conll03-english \ 
  --task-type ner \ 
  --start

Use an authentication method to access your cluster. Refer to Authentication methods.

The cluster URL. Alternatively, use --cloud-id.

Specify the identifier for the model in the Hugging Face model hub.

Specify the type of NLP task. Supported values are fill_mask, ner, question_answering, text_classification, text_embedding, and zero_shot_classification.

>>> import elasticsearch
>>> from pathlib import Path
>>> from eland.ml.pytorch import PyTorchModel
>>> from eland.ml.pytorch.transformers import TransformerModel

# Load a Hugging Face transformers model directly from the model hub
>>> tm = TransformerModel("elastic/distilbert-base-cased-finetuned-conll03-english", "ner")
Downloading: 100%|██████████| 257/257 [00:00<00:00, 108kB/s]
Downloading: 100%|██████████| 954/954 [00:00<00:00, 372kB/s]
Downloading: 100%|██████████| 208k/208k [00:00<00:00, 668kB/s]
Downloading: 100%|██████████| 112/112 [00:00<00:00, 43.9kB/s]
Downloading: 100%|██████████| 249M/249M [00:23<00:00, 11.2MB/s]

# Export the model in a TorchScript representation which Elasticsearch uses
>>> tmp_path = "models"
>>> Path(tmp_path).mkdir(parents=True, exist_ok=True)
>>> model_path, config, vocab_path = tm.save(tmp_path)

# Import model into Elasticsearch
>>> es = elasticsearch.Elasticsearch("http://elastic:mlqa_admin@localhost:9200", timeout=300)  # 5 minute timeout
>>> ptm = PyTorchModel(es, tm.elasticsearch_model_id())
>>> ptm.import_model(model_path=model_path, config_path=None, vocab_path=vocab_path, config=config)
100%|██████████| 63/63 [00:12<00:00,  5.02it/s]

Import model with Dockeredit

To use the Docker container, you need to clone the Eland repository: https://github.com/elastic/eland

If you want to use Eland without installing it, you can build the Docker container to run the available scripts:

$ docker build -t elastic/eland .

You can now use the container interactively:

$ docker run -it --rm --network host elastic/eland

Running installed scripts is also possible without an interactive shell, for example:

docker run -it --rm elastic/eland \
    eland_import_hub_model \
      --url $ELASTICSEARCH_URL \
      --hub-model-id elastic/distilbert-base-uncased-finetuned-conll03-english \
      --start

Replace the $ELASTICSEARCH_URL with the URL for your Elasticsearch cluster. For authentication purposes, include an administrator username and password in the URL in the following format: https://username:password@host:port.

Authentication methodsedit

The following authentication options are available when using the import script:

  • username and password authentication (specified with the -u and -p options):

    eland_import_hub_model -u <username> -p <password> --cloud-id <cloud-id> ...

    These -u and -p options also work when you use --url.

  • username and password authentication (embedded in the URL):

    eland_import_hub_model --url https://<user>:<password>@<hostname>:<port> ...
  • API key authentication:

    eland_import_hub_model --es-api-key <api-key> --url https://<hostname>:<port> ...