Significant Text Aggregation Usage
An aggregation that returns interesting or unusual occurrences of free-text terms in a set. It is like the significant terms aggregation but differs in that:
- It is specifically designed for use on type text fields
- It does not require field data or doc-values
- It re-analyzes text content on-the-fly, meaning it can also filter duplicate sections of noisy text that otherwise tend to skew statistics.
Re-analyzing large result sets requires a lot of time and memory. It is recommended that the significant_text aggregation be used as a child of either the sampler or diversified sampler aggregation, which limits the analysis to a small selection of top-matching documents, e.g. 200. This typically improves speed, memory use and quality of results.
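As a rough sketch of that recommendation, the request body below nests significant_text under a sampler aggregation. The outer aggregation name "sample" is a placeholder chosen for illustration, and the shard_size of 200 simply mirrors the figure mentioned above; the field and inner aggregation name are taken from the examples on this page.

```json
{
  "sample": {
    "sampler": {
      "shard_size": 200
    },
    "aggs": {
      "significant_descriptions": {
        "significant_text": {
          "field": "description",
          "filter_duplicate_text": true
        }
      }
    }
  }
}
```

With this shape, only the top-matching documents collected by the sampler on each shard are re-analyzed, rather than the full result set.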
See the Elasticsearch documentation on significant text aggregation for more detail.
Fluent DSL example
a => a
    .SignificantText("significant_descriptions", st => st
        .Field(p => p.Description)
        .FilterDuplicateText()
    )
Object Initializer syntax example
new SignificantTextAggregation("significant_descriptions")
{
    Field = Infer.Field<Project>(p => p.Description),
    FilterDuplicateText = true
}
Example JSON output.
{
  "significant_descriptions": {
    "significant_text": {
      "field": "description",
      "filter_duplicate_text": true
    }
  }
}
Handling Responses
response.ShouldBeValid();

// Look up the aggregation result by the name used in the request
var sigNames = response.Aggregations.SignificantText("significant_descriptions");
sigNames.Should().NotBeNull();
sigNames.DocCount.Should().BeGreaterThan(0);

// Each bucket is one significant term with its counts and score
foreach (var bucket in sigNames.Buckets)
{
    bucket.Key.Should().NotBeNullOrEmpty();
    bucket.BgCount.Should().BeGreaterThan(0);
    bucket.DocCount.Should().BeGreaterThan(0);
    bucket.Score.Should().BeGreaterThan(0);
}