Weighted avg aggregation

A single-value metrics aggregation that computes the weighted average of numeric values that are extracted from the aggregated documents. These values can be extracted either from specific numeric fields in the documents or from a script.

When calculating a regular average, each datapoint has an equal "weight": it contributes equally to the final value. Weighted averages, on the other hand, weight each datapoint differently. The amount that each datapoint contributes to the final value is extracted from the document.

As a formula, a weighted average is ∑(value * weight) / ∑(weight).

A regular average can be thought of as a weighted average where every value has an implicit weight of 1.
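
The formula can be checked by hand with a few lines of Python. This is a minimal sketch for illustration only, not how Elasticsearch computes the aggregation; the function name `weighted_average` and the sample (grade, weight) pairs are hypothetical.

```python
from typing import Iterable, Tuple


def weighted_average(pairs: Iterable[Tuple[float, float]]) -> float:
    """Return sum(value * weight) / sum(weight) for (value, weight) pairs."""
    numerator = 0.0
    denominator = 0.0
    for value, weight in pairs:
        numerator += value * weight
        denominator += weight
    if denominator == 0:
        raise ValueError("total weight must be non-zero")
    return numerator / denominator


# Hypothetical sample data: (grade, weight) pairs.
samples = [(90.0, 1.0), (60.0, 3.0)]

# (90*1 + 60*3) / (1 + 3) = 67.5
weighted = weighted_average(samples)

# Giving every value a weight of 1 reproduces the regular average: (90 + 60) / 2 = 75.0
regular = weighted_average((value, 1.0) for value, _ in samples)

print(weighted, regular)
```

The second call shows the point made above: a regular average is the special case where every weight is 1.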

Table 51. weighted_avg Parameters

| Parameter Name | Description | Required | Default Value |
|---|---|---|---|
| value | The configuration for the field or script that provides the values | Required | |
| weight | The configuration for the field or script that provides the weights | Required | |
| format | The numeric response formatter | Optional | |

The value and weight objects have per-field specific configuration:

Table 52. value Parameters

| Parameter Name | Description | Required | Default Value |
|---|---|---|---|
| field | The field that values should be extracted from | Required | |
| missing | A value to use if the field is missing entirely | Optional | |

Table 53. weight Parameters

| Parameter Name | Description | Required | Default Value |
|---|---|---|---|
| field | The field that weights should be extracted from | Required | |
| missing | A weight to use if the field is missing entirely | Optional | |

Examples

If our documents have a "grade" field that holds a 0-100 numeric score, and a "weight" field which holds an arbitrary numeric weight, we can calculate the weighted average using:

Python:

resp = client.search(
    index="exams",
    size=0,
    aggs={
        "weighted_grade": {
            "weighted_avg": {
                "value": {
                    "field": "grade"
                },
                "weight": {
                    "field": "weight"
                }
            }
        }
    },
)
print(resp)

Ruby:

response = client.search(
  index: 'exams',
  body: {
    size: 0,
    aggregations: {
      weighted_grade: {
        weighted_avg: {
          value: {
            field: 'grade'
          },
          weight: {
            field: 'weight'
          }
        }
      }
    }
  }
)
puts response

JavaScript:

const response = await client.search({
  index: "exams",
  size: 0,
  aggs: {
    weighted_grade: {
      weighted_avg: {
        value: {
          field: "grade",
        },
        weight: {
          field: "weight",
        },
      },
    },
  },
});
console.log(response);

Console:

POST /exams/_search
{
  "size": 0,
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "field": "grade"
        },
        "weight": {
          "field": "weight"
        }
      }
    }
  }
}

Which yields a response like:

{
  ...
  "aggregations": {
    "weighted_grade": {
      "value": 70.0
    }
  }
}

While multiple values per field are allowed, only one weight is allowed per document. If the aggregation encounters a document that has more than one weight (for example, the weight field is a multi-valued field), it aborts the search. In that situation, build a runtime field that combines those values into a single weight.

This single weight will be applied independently to each value extracted from the value field.

This example shows how a single document with multiple values is averaged with a single weight:

Python:

resp = client.index(
    index="exams",
    refresh=True,
    document={
        "grade": [
            1,
            2,
            3
        ],
        "weight": 2
    },
)
print(resp)

resp1 = client.search(
    index="exams",
    size=0,
    aggs={
        "weighted_grade": {
            "weighted_avg": {
                "value": {
                    "field": "grade"
                },
                "weight": {
                    "field": "weight"
                }
            }
        }
    },
)
print(resp1)

Ruby:

response = client.index(
  index: 'exams',
  refresh: true,
  body: {
    grade: [
      1,
      2,
      3
    ],
    weight: 2
  }
)
puts response

response = client.search(
  index: 'exams',
  body: {
    size: 0,
    aggregations: {
      weighted_grade: {
        weighted_avg: {
          value: {
            field: 'grade'
          },
          weight: {
            field: 'weight'
          }
        }
      }
    }
  }
)
puts response

JavaScript:

const response = await client.index({
  index: "exams",
  refresh: "true",
  document: {
    grade: [1, 2, 3],
    weight: 2,
  },
});
console.log(response);

const response1 = await client.search({
  index: "exams",
  size: 0,
  aggs: {
    weighted_grade: {
      weighted_avg: {
        value: {
          field: "grade",
        },
        weight: {
          field: "weight",
        },
      },
    },
  },
});
console.log(response1);

Console:

POST /exams/_doc?refresh
{
  "grade": [1, 2, 3],
  "weight": 2
}

POST /exams/_search
{
  "size": 0,
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "field": "grade"
        },
        "weight": {
          "field": "weight"
        }
      }
    }
  }
}

The three values (1, 2, and 3) will be included as independent values, all with the weight of 2:

{
  ...
  "aggregations": {
    "weighted_grade": {
      "value": 2.0
    }
  }
}

The aggregation returns 2.0 as the result, which matches what we would expect when calculating by hand: ((1*2) + (2*2) + (3*2)) / (2+2+2) == 2
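
The same hand calculation can be written as a short Python sketch. It only models the rules described above (one weight per document, applied independently to each value) and is not the Elasticsearch implementation; the helper names `as_list` and `weighted_average_docs` are hypothetical.

```python
from typing import Any, Dict, List


def as_list(field: Any) -> List[float]:
    """Treat a scalar field as a one-element list."""
    return list(field) if isinstance(field, (list, tuple)) else [field]


def weighted_average_docs(docs: List[Dict[str, Any]]) -> float:
    """Weighted average over documents with 'grade' and 'weight' fields."""
    numerator = 0.0
    denominator = 0.0
    for doc in docs:
        weights = as_list(doc["weight"])
        if len(weights) > 1:
            # Models the documented rule: more than one weight is an error.
            raise ValueError("a document may carry only one weight")
        weight = weights[0]
        for value in as_list(doc["grade"]):
            # The single weight is applied to every value independently.
            numerator += value * weight
            denominator += weight
    return numerator / denominator


result = weighted_average_docs([{"grade": [1, 2, 3], "weight": 2}])
print(result)  # 2.0
```

Passing a document whose weight is a list such as `[2, 3]` raises `ValueError` in this sketch, which stands in for the aborted search described earlier.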

Runtime field

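If you have to sum or weigh values that don’t quite line up with the indexed values, run the aggregation on a runtime field. The following example indexes two documents, sums each document's weight values into a single weight.combined runtime field, and uses a script to add 1 to each grade.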

Python:

resp = client.index(
    index="exams",
    refresh=True,
    document={
        "grade": 100,
        "weight": [
            2,
            3
        ]
    },
)
print(resp)

resp1 = client.index(
    index="exams",
    refresh=True,
    document={
        "grade": 80,
        "weight": 3
    },
)
print(resp1)

resp2 = client.search(
    index="exams",
    filter_path="aggregations",
    size=0,
    runtime_mappings={
        "weight.combined": {
            "type": "double",
            "script": "\n        double s = 0;\n        for (double w : doc['weight']) {\n          s += w;\n        }\n        emit(s);\n      "
        }
    },
    aggs={
        "weighted_grade": {
            "weighted_avg": {
                "value": {
                    "script": "doc.grade.value + 1"
                },
                "weight": {
                    "field": "weight.combined"
                }
            }
        }
    },
)
print(resp2)

Ruby:

response = client.index(
  index: 'exams',
  refresh: true,
  body: {
    grade: 100,
    weight: [
      2,
      3
    ]
  }
)
puts response

response = client.index(
  index: 'exams',
  refresh: true,
  body: {
    grade: 80,
    weight: 3
  }
)
puts response

response = client.search(
  index: 'exams',
  filter_path: 'aggregations',
  body: {
    size: 0,
    runtime_mappings: {
      'weight.combined' => {
        type: 'double',
        script: "\n        double s = 0;\n        for (double w : doc['weight']) {\n          s += w;\n        }\n        emit(s);\n      "
      }
    },
    aggregations: {
      weighted_grade: {
        weighted_avg: {
          value: {
            script: 'doc.grade.value + 1'
          },
          weight: {
            field: 'weight.combined'
          }
        }
      }
    }
  }
)
puts response

JavaScript:

const response = await client.index({
  index: "exams",
  refresh: "true",
  document: {
    grade: 100,
    weight: [2, 3],
  },
});
console.log(response);

const response1 = await client.index({
  index: "exams",
  refresh: "true",
  document: {
    grade: 80,
    weight: 3,
  },
});
console.log(response1);

const response2 = await client.search({
  index: "exams",
  filter_path: "aggregations",
  size: 0,
  runtime_mappings: {
    "weight.combined": {
      type: "double",
      script:
        "\n        double s = 0;\n        for (double w : doc['weight']) {\n          s += w;\n        }\n        emit(s);\n      ",
    },
  },
  aggs: {
    weighted_grade: {
      weighted_avg: {
        value: {
          script: "doc.grade.value + 1",
        },
        weight: {
          field: "weight.combined",
        },
      },
    },
  },
});
console.log(response2);

Console:

POST /exams/_doc?refresh
{
  "grade": 100,
  "weight": [2, 3]
}

POST /exams/_doc?refresh
{
  "grade": 80,
  "weight": 3
}

POST /exams/_search?filter_path=aggregations
{
  "size": 0,
  "runtime_mappings": {
    "weight.combined": {
      "type": "double",
      "script": """
        double s = 0;
        for (double w : doc['weight']) {
          s += w;
        }
        emit(s);
      """
    }
  },
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "script": "doc.grade.value + 1"
        },
        "weight": {
          "field": "weight.combined"
        }
      }
    }
  }
}

The response should look like:

{
  "aggregations": {
    "weighted_grade": {
      "value": 93.5
    }
  }
}
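
The value 93.5 can be reproduced by hand. The sketch below is plain Python that mimics the two scripts in the request (summing the weights, adding 1 to the grade) for the two indexed documents; it is an illustration of the arithmetic, not Painless and not the aggregation's implementation. The helper name `combined_weight` is hypothetical.

```python
# The two documents indexed in the example above.
docs = [
    {"grade": 100, "weight": [2, 3]},
    {"grade": 80, "weight": 3},
]


def combined_weight(doc: dict) -> float:
    """Sum all weights of a document, like the weight.combined runtime field."""
    weight = doc["weight"]
    return float(sum(weight)) if isinstance(weight, list) else float(weight)


# The value script adds 1 to each grade: 101 and 81.
# Combined weights are 2 + 3 = 5 and 3.
numerator = sum((doc["grade"] + 1) * combined_weight(doc) for doc in docs)
denominator = sum(combined_weight(doc) for doc in docs)

# (101*5 + 81*3) / (5 + 3) = 748 / 8 = 93.5
result = numerator / denominator
print(result)  # 93.5
```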

Missing values

By default, the aggregation excludes documents with a missing or null value for the value or weight field. Use the missing parameter to specify a default value for these documents instead.

Python:

resp = client.search(
    index="exams",
    size=0,
    aggs={
        "weighted_grade": {
            "weighted_avg": {
                "value": {
                    "field": "grade",
                    "missing": 2
                },
                "weight": {
                    "field": "weight",
                    "missing": 3
                }
            }
        }
    },
)
print(resp)

JavaScript:

const response = await client.search({
  index: "exams",
  size: 0,
  aggs: {
    weighted_grade: {
      weighted_avg: {
        value: {
          field: "grade",
          missing: 2,
        },
        weight: {
          field: "weight",
          missing: 3,
        },
      },
    },
  },
});
console.log(response);

Console:

POST /exams/_search
{
  "size": 0,
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": {
          "field": "grade",
          "missing": 2
        },
        "weight": {
          "field": "weight",
          "missing": 3
        }
      }
    }
  }
}
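
To make the effect of the missing parameter concrete, here is a small Python sketch of the behavior described in this section: documents lacking a value or weight are skipped unless a default is supplied. It is an assumption-laden model, not the Elasticsearch implementation; the function name `weighted_average_with_missing` and the three sample documents are hypothetical.

```python
from typing import Any, Dict, List, Optional


def weighted_average_with_missing(
    docs: List[Dict[str, Any]],
    missing_value: Optional[float] = None,
    missing_weight: Optional[float] = None,
) -> Optional[float]:
    """Weighted average that skips incomplete documents unless defaults are given."""
    numerator = 0.0
    denominator = 0.0
    for doc in docs:
        value = doc.get("grade")
        weight = doc.get("weight")
        # Substitute the configured defaults for missing or null fields.
        if value is None:
            value = missing_value
        if weight is None:
            weight = missing_weight
        # Without a default, the document is excluded from the average.
        if value is None or weight is None:
            continue
        numerator += value * weight
        denominator += weight
    return numerator / denominator if denominator else None


# Hypothetical documents: one complete, one without a weight, one without a grade.
docs = [{"grade": 90, "weight": 1}, {"grade": 50}, {"weight": 4}]

# Default behavior: only the complete document counts, so the result is 90.0.
default_result = weighted_average_with_missing(docs)

# With missing=2 for the value and missing=3 for the weight:
# (90*1 + 50*3 + 2*4) / (1 + 3 + 4) = 248 / 8 = 31.0
filled_result = weighted_average_with_missing(docs, missing_value=2, missing_weight=3)

print(default_result, filled_result)
```

The defaults 2 and 3 match the request above; the large shift from 90.0 to 31.0 in this toy data shows why the chosen defaults deserve care.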