Significant Text Aggregation Usage
An aggregation that returns interesting or unusual occurrences of free-text terms in a set. It is like the significant terms aggregation but differs in that:
- It is specifically designed for use on type text fields
- It does not require field data or doc-values
- It re-analyzes text content on-the-fly, meaning it can also filter duplicate sections of noisy text that otherwise tend to skew statistics.
Re-analyzing large result sets requires a lot of time and memory. It is recommended that the significant_text aggregation be used as a child of either the sampler or diversified sampler aggregation, to limit the analysis to a small selection of top-matching documents, e.g. 200. This will typically improve speed, memory use and quality of results.
See the Elasticsearch documentation on significant text aggregation for more detail.
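The sampler recommendation above can be sketched with the fluent DSL roughly as follows. This is a minimal sketch under stated assumptions, not an example taken from the client's own tests: the Project class here is a stand-in for the document type used on this page, the aggregation name "sample" and the SampledSignificantText helper are hypothetical, and the shard size of 200 simply mirrors the figure suggested above.

```csharp
using Nest;

// Hypothetical POCO standing in for the document type used on this page.
public class Project
{
    public string Description { get; set; }
}

// Hypothetical helper; the class and method names are illustrative only.
public static class SampledSignificantText
{
    public static ISearchResponse<Project> Search(IElasticClient client) =>
        client.Search<Project>(s => s
            .Size(0) // only the aggregation results are of interest here
            .Aggregations(a => a
                // Parent sampler limits analysis to the top-matching documents per shard.
                .Sampler("sample", sa => sa
                    .ShardSize(200)
                    .Aggregations(aa => aa
                        // significant_text runs only over the sampled documents.
                        .SignificantText("significant_descriptions", st => st
                            .Field(p => p.Description)
                            .FilterDuplicateText()
                        )
                    )
                )
            )
        );
}
```

With this nesting, the significant text buckets sit under the sampler in the response, so they should be reachable through response.Aggregations.Sampler("sample").SignificantText("significant_descriptions") rather than directly on response.Aggregations.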
Fluent DSL example
    a => a
        .SignificantText("significant_descriptions", st => st
            .Field(p => p.Description)
            .FilterDuplicateText()
        )
Object Initializer syntax example
    new SignificantTextAggregation("significant_descriptions")
    {
        Field = Infer.Field<Project>(p => p.Description),
        FilterDuplicateText = true
    }
Example JSON output:
    {
      "significant_descriptions": {
        "significant_text": {
          "field": "description",
          "filter_duplicate_text": true
        }
      }
    }
Handling Responses
    response.ShouldBeValid();
    var sigNames = response.Aggregations.SignificantText("significant_descriptions");
    sigNames.Should().NotBeNull();
    sigNames.DocCount.Should().BeGreaterThan(0);
    foreach (var bucket in sigNames.Buckets)
    {
        bucket.Key.Should().NotBeNullOrEmpty();
        bucket.BgCount.Should().BeGreaterThan(0);
        bucket.DocCount.Should().BeGreaterThan(0);
        bucket.Score.Should().BeGreaterThan(0);
    }