General Scout best practices
For test-type-specific guidance, see UI test best practices and API test best practices.
Scout is deployment-agnostic: write once, run locally and on Elastic Cloud.
- Every suite must have deployment tags. Use tags to target the environments where your tests apply (for example, a feature that only exists in stateful deployments).
- Within a test, avoid relying on configuration, data, or behavior specific to a single deployment. Test logic should produce the same result locally and on Cloud.
- Run your tests against a real Elastic Cloud project before merging to catch environment-specific surprises early. See Run tests on Elastic Cloud for setup instructions.
A test should live in the plugin or package that owns the code it exercises. When writing or reviewing a test, confirm that the scenarios logically belong to the plugin they were added to:
- API tests: the routes under test should be defined in this plugin's /server directory.
- UI tests: the UI being driven should come from this plugin's /public directory. A quick look there is usually enough to understand what the plugin renders and whether the test fits.
This also keeps Scout's selective testing effective: it runs only the tests for modules affected by a PR, so a test placed in the wrong plugin won't be triggered by changes to the code it actually covers. The full suite still runs post-merge on kibana-on-merge.
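The effect of a misplaced test on selective testing can be illustrated with a small model. This is only a sketch: the module names, paths, and the `affectedModules` and `selectSpecs` functions are hypothetical, and the real logic Scout uses to detect affected modules may differ.

```typescript
// Toy model of selective testing. All names and paths are hypothetical.
interface ModuleInfo {
  name: string;
  root: string; // directory that contains the module's source
  specs: string[]; // spec files that live in this module
}

// A module counts as affected when any changed file sits under its root.
function affectedModules(modules: ModuleInfo[], changedFiles: string[]): ModuleInfo[] {
  return modules.filter((m) => changedFiles.some((file) => file.startsWith(`${m.root}/`)));
}

// Only specs that live in affected modules are selected for the PR run.
function selectSpecs(modules: ModuleInfo[], changedFiles: string[]): string[] {
  const selected: string[] = [];
  for (const m of affectedModules(modules, changedFiles)) {
    selected.push(...m.specs);
  }
  return selected;
}

const modules: ModuleInfo[] = [
  {
    name: 'alpha',
    root: 'plugins/alpha',
    specs: ['plugins/alpha/test/scout/ui/alpha.spec.ts'],
  },
  {
    name: 'beta',
    root: 'plugins/beta',
    // Misplaced: this spec drives alpha's UI but lives in beta.
    specs: ['plugins/beta/test/scout/ui/alpha_flow.spec.ts'],
  },
];

// A PR that only touches alpha's code selects alpha's specs,
// so the misplaced spec in beta is not part of the run.
const selected = selectSpecs(modules, ['plugins/alpha/public/app.tsx']);
```

In this model the spec stored under `plugins/beta` never runs for a change to `plugins/alpha`, even though it exercises alpha's UI.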
When a feature is gated behind a flag, enable it at runtime with apiServices.core.settings() rather than creating a custom server config. Runtime flags work locally and on Cloud, don’t require a server restart, and avoid the CI cost of a dedicated server instance.
For the full guide (including when a custom server config is unavoidable), see Feature flags.
When you add new tests, fix flakes, or make significant changes, run the same tests multiple times to catch flakiness early. A good starting point is 20–50 runs.
Prefer doing this locally first (faster feedback), and use the Flaky Test Runner in CI when needed. See Debug flaky tests for guidance.
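A rough way to reason about the run count is to estimate the chance of seeing at least one failure. The sketch below assumes runs are independent and the flake rate is constant, which is a simplification. The function name `detectionProbability` and the 5% rate are illustrative only.

```typescript
// Probability of observing at least one failure in `runs` repetitions,
// assuming each run fails independently with probability `failureRate`.
function detectionProbability(failureRate: number, runs: number): number {
  if (failureRate < 0 || failureRate > 1) {
    throw new RangeError('failureRate must be between 0 and 1');
  }
  if (!Number.isInteger(runs) || runs < 0) {
    throw new RangeError('runs must be a non-negative integer');
  }
  // P(at least one failure) = 1 - P(every run passes)
  return 1 - Math.pow(1 - failureRate, runs);
}

// Example: a test that flakes 5% of the time (made-up rate).
const after20Runs = detectionProbability(0.05, 20);
const after50Runs = detectionProbability(0.05, 50);
```

Under these assumptions, a test that flakes 5% of the time is caught in roughly 64% of 20-run batches and roughly 92% of 50-run batches, so rarer flakes need the higher end of the range.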
- Keep one top-level suite per file (test.describe).
- Avoid nested describe blocks. Use test.step for structure inside a test.
- Don’t rely on test file execution order (it’s not guaranteed).
- Don’t assume a previous test in the suite already set up the data you need (if that test fails or is skipped, the dependent test will break with a misleading error).
Use test.step() (or apiTest.step() in API tests) to structure a multi-step flow within a single test. It keeps the test in one context (faster, clearer reporting) and produces labelled entries in the test report that make failures easier to diagnose. Group closely related actions into a single step when it keeps the report readable without hiding intent.
Example
test('navigates through pages', async ({ pageObjects }) => {
  await test.step('go to Dashboards', async () => {
    await pageObjects.navigation.clickDashboards();
  });
  await test.step('go to Overview', async () => {
    await pageObjects.navigation.clickOverview();
  });
});
The same pattern works in API tests with apiTest.step().
Test names should read like a sentence describing expected behavior. Clear names make failures self-explanatory and test suites scannable.
Examples
❌ Don’t:
test('test 1', async ({ page }) => {
  /* ... */
});
test('works correctly', async ({ page }) => {
  /* ... */
});
❌ Don’t: use variables or template literals in test titles as they look opaque in stack traces and test reports:
test(`handles ${dataView.title} correctly`, async ({ page }) => {
  /* ... */
});
✔️ Do:
test('viewer can see dashboard but cannot edit', async ({ page }) => {
  /* ... */
});
test('returns 403 when missing read privilege', async ({ apiClient }) => {
  /* ... */
});
Prefer “one role + one flow per file” and keep spec files small (roughly 4–5 short tests or 2–3 longer ones). The test runner balances work at the spec-file level, so oversized files become bottlenecks during parallel execution. Put shared login/navigation in beforeEach.
Example
// dashboard_viewer.spec.ts
test.beforeEach(async ({ browserAuth, pageObjects }) => {
  await browserAuth.loginAsViewer();
  await pageObjects.dashboard.goto();
});
test('can see dashboard', async ({ page }) => {
  // assertions...
});
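To see why an oversized spec file becomes a bottleneck when work is balanced per file, consider the toy scheduler below. It is a sketch under stated assumptions: the greedy "longest file first, least-loaded worker" strategy, the function name `wallClockMinutes`, and the durations are all hypothetical and are not the runner's actual algorithm.

```typescript
// Toy scheduler: a spec file is an indivisible unit of work.
// Files are assigned, longest first, to whichever worker is least loaded.
function wallClockMinutes(specDurations: number[], workers: number): number {
  if (!Number.isInteger(workers) || workers < 1) {
    throw new RangeError('workers must be a positive integer');
  }
  const loads: number[] = new Array(workers).fill(0);
  const sorted = [...specDurations].sort((a, b) => b - a);
  for (const duration of sorted) {
    const lightest = loads.indexOf(Math.min(...loads));
    loads[lightest] += duration;
  }
  // The run finishes when the busiest worker finishes.
  return Math.max(...loads);
}

// Same 16 minutes of total work on 3 workers, packaged two ways.
const oneOversizedFile = wallClockMinutes([12, 2, 2], 3);
const splitIntoSmallFiles = wallClockMinutes([4, 4, 4, 2, 2], 3);
```

In this model the 12-minute file pins the run at 12 minutes no matter how many workers are idle, while the same work split into smaller files finishes in 6 minutes.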
If many files share the same “one-time” work (archives, API calls, settings), move it to a global setup hook.
Example
globalSetupHook('Load shared test data (if needed)', async ({ esArchiver, log }) => {
  log.debug('[setup] loading archives (only if indexes do not exist)...');
  await esArchiver.loadIfNeeded(MY_ARCHIVE);
});
Global setup hooks have no corresponding teardown. Keep operations that require cleanup (such as kbnClient.importExport.load()) in beforeAll/afterAll hooks so saved objects are properly removed after tests run. See Global setup hook: When to use for guidance.
It’s common for test suites to load Elasticsearch or Kibana archives that are barely used (or not used at all). Unused archives slow down setup, waste resources, and make it harder to understand what a test actually depends on. Check that your tests load only the data they actually use.
Use esArchiver.loadIfNeeded(), which skips ingestion if the index already exists (useful when multiple suites share the same data).
loadIfNeeded() checks at the index level, not individual documents. If a test deletes specific documents, subsequent runs or retries won't restore them, so a test that deletes documents must re-ingest them itself.
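The index-level check can be pictured with a small in-memory model. This is a toy sketch and not the esArchiver implementation: the `FakeCluster` class, its methods, and the index and document names are hypothetical and only mirror the behavior described above.

```typescript
// Toy in-memory model of "skip ingestion if the index already exists".
class FakeCluster {
  private readonly indices = new Map<string, Set<string>>();

  // Ingest only when the index is missing; never inspect individual documents.
  loadIfNeeded(index: string, docIds: string[]): 'loaded' | 'skipped' {
    if (this.indices.has(index)) return 'skipped';
    this.indices.set(index, new Set(docIds));
    return 'loaded';
  }

  deleteDoc(index: string, docId: string): void {
    this.indices.get(index)?.delete(docId);
  }

  count(index: string): number {
    return this.indices.get(index)?.size ?? 0;
  }
}

const cluster = new FakeCluster();
const archiveDocs = ['doc-1', 'doc-2', 'doc-3'];

const firstLoad = cluster.loadIfNeeded('metrics', archiveDocs); // index is created
cluster.deleteDoc('metrics', 'doc-2'); // a test removes one document
const secondLoad = cluster.loadIfNeeded('metrics', archiveDocs); // index exists, nothing restored
const remaining = cluster.count('metrics');
```

After the second call the index still holds only two documents, which is why a retry of the deleting test would start from incomplete data.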
Examples
❌ Don’t: load archives that no test in the suite relies on:
test.beforeAll(async ({ esArchiver }) => {
  await esArchiver.loadIfNeeded('large_metrics_archive');
  await esArchiver.loadIfNeeded('user_actions_archive');
});
test('shows metrics dashboard', async ({ page }) => {
  // only uses large_metrics_archive; user_actions_archive is never referenced
});
✔️ Do: load only what the suite needs:
test.beforeAll(async ({ esArchiver }) => {
  await esArchiver.loadIfNeeded('large_metrics_archive');
});
Cleanup in the test body doesn’t run after a failure. Prefer afterEach / afterAll. Don’t duplicate the same teardown in the test body when a hook already runs it; duplication invites unnecessary try/catch and drift between paths.
Examples
❌ Don’t: put cleanup at the end of the test body (it’s skipped if the test fails):
test('creates and deletes index', async ({ esClient }) => {
  await esClient.indices.create({ index: testIndexName });
  // ... assertions ...
  await esClient.indices.delete({ index: testIndexName }); // skipped on failure!
});
✔️ Do: use hooks so cleanup always runs:
test.afterEach(async ({ esClient, log }) => {
  try {
    await esClient.indices.delete({ index: testIndexName });
  } catch (e: any) {
    log.debug(`Index cleanup failed: ${e.message}`);
  }
});
Tests should be clean and declarative. If a helper might return an expected error (for example, 404 during cleanup), the helper should handle it internally, for example by accepting an ignoreErrors option or treating a 404 during deletion as a success.
Examples
❌ Don’t: catch errors in the test:
test.afterAll(async ({ apiServices }) => {
  try {
    await apiServices.cases.delete(caseId);
  } catch {
    // might already be deleted
  }
});
✔️ Do: let the helper handle expected errors:
test.afterAll(async ({ apiServices }) => {
  await apiServices.cases.cleanup.deleteAllCases();
});
Inside the helper, handle the expected error — either by treating 404 as success, or by accepting an ignoreErrors option:
async function deleteCase(caseId: string, { ignoreErrors = false } = {}) {
  const response = await apiClient.delete(`api/cases/${caseId}`, {
    headers: { ...COMMON_HEADERS, ...adminCredentials.apiKeyHeader },
  });
  // already gone, nothing to do
  if (response.status === 404) return;
  if (!ignoreErrors && response.status >= 400) {
    throw new Error(`Failed to delete case ${caseId}: ${response.status}`);
  }
}
When a test verifies multiple independent items (KPI tiles, chart counts, table columns, several response fields), consider using expect.soft() so the test keeps checking every item instead of stopping at the first failure, which makes troubleshooting easier. Playwright still fails the test at the end if any soft assertion failed.
Examples
UI test:
test('Overview tab shows all KPI values', async ({ pageObjects }) => {
  await pageObjects.nodeDetails.clickOverviewTab();
  await expect.soft(pageObjects.nodeDetails.getKPI('cpuUsage')).toHaveText('50.0%');
  await expect.soft(pageObjects.nodeDetails.getKPI('memoryUsage')).toHaveText('35.0%');
  await expect.soft(pageObjects.nodeDetails.getKPI('diskUsage')).toHaveText('80.0%');
});
API test:
apiTest('returns expected summary fields', async ({ apiClient }) => {
  const response = await apiClient.get('api/my-feature/summary', {
    headers: { ...COMMON_HEADERS, ...viewerCredentials.apiKeyHeader },
  });
  expect(response).toHaveStatusCode(200);
  expect.soft(response.body.total).toBe(42);
  expect.soft(response.body.active).toBe(10);
  expect.soft(response.body.archived).toBe(32);
});
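The "collect every failure, then fail once at the end" behavior of soft assertions can be sketched with a tiny stand-in. This is a conceptual model only and not how Playwright implements expect.soft(): the `SoftChecks` class and the response values are hypothetical.

```typescript
// Conceptual stand-in for soft assertions: record mismatches, fail at the end.
class SoftChecks {
  private readonly failures: string[] = [];

  // Record a mismatch instead of throwing, so later checks still run.
  equal<T>(label: string, actual: T, expected: T): void {
    if (actual !== expected) {
      this.failures.push(`${label}: expected ${String(expected)}, received ${String(actual)}`);
    }
  }

  get failureCount(): number {
    return this.failures.length;
  }

  // Called once after all checks: throw if anything was recorded.
  assertAll(): void {
    if (this.failures.length > 0) {
      throw new Error(
        `${this.failures.length} soft check(s) failed:\n${this.failures.join('\n')}`
      );
    }
  }
}

// Made-up response body with two wrong fields.
const body = { total: 42, active: 9, archived: 30 };

const soft = new SoftChecks();
soft.equal('total', body.total, 42);
soft.equal('active', body.active, 10);
soft.equal('archived', body.archived, 32);
```

Both mismatches (`active` and `archived`) are recorded in one run, and `soft.assertAll()` then throws a single error listing them. A hard assertion would have stopped at `active` and hidden the second problem.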
If a value is reused across suites (archive paths, fixed time ranges, endpoints, common headers), extract it into a shared constants.ts file. This reduces duplication and typos, and makes updates safer.
Example
// test/scout/ui/constants.ts
export const LENS_BASIC_TIME_RANGE = {
  from: 'Sep 22, 2015 @ 00:00:00.000',
  to: 'Sep 23, 2015 @ 00:00:00.000',
};
export const DASHBOARD_SAVED_SEARCH_ARCHIVE =
  'src/platform/test/functional/fixtures/kbn_archiver/dashboard/current/kibana';
export const DASHBOARD_DEFAULT_INDEX_TITLE = 'logstash-*';

// test/scout/api/constants.ts
export const COMMON_HEADERS = {
  'kbn-xsrf': 'some-xsrf-token',
  'x-elastic-internal-origin': 'kibana',
  'Content-Type': 'application/json;charset=UTF-8',
} as const;
Avoid admin unless there’s no alternative. Minimal permissions catch real permission bugs and keep tests realistic. Also test the forbidden path: verify that an under-privileged role receives 403 for endpoints it shouldn’t access.
See browser authentication and API authentication.
Examples
❌ Don’t: default to admin for convenience:
test.beforeEach(async ({ browserAuth }) => {
  await browserAuth.loginAsAdmin();
});
✔️ Do: use a built-in role when it fits (viewer, editor, etc.), or create a custom one for tighter scoping:
// built-in role
await browserAuth.loginAsViewer();
// custom role for finer-grained control
await browserAuth.loginWithCustomRole('logs_analyst', {
  elasticsearch: {
    indices: [{ names: ['logs-*'], privileges: ['read'] }],
  },
  kibana: [{ spaces: ['*'], base: [], feature: { discover: ['read'] } }],
});
If you build a helper that will benefit other tests, consider upstreaming it:
- Reusable across many plugins/teams: contribute to @kbn/scout
- Reusable but solution-scoped: contribute to the relevant solution Scout package
- Plugin-specific: keep it in your plugin’s test/scout tree
For the full guidance, see Scout.
When you move a helper into @kbn/scout or a solution Scout package, don't import types from plugins or plugin-scoped packages. Scout packages are intentionally slim, shared infrastructure — adding a dependency on a specific plugin's types pulls that plugin into every consumer and breaks the sharing model.
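One way to keep a shared helper free of plugin types is to declare only the fields the helper actually reads and rely on TypeScript's structural typing. The sketch below is illustrative: `HasId`, `collectIds`, and `CaseSummary` are hypothetical names, not existing Scout or plugin exports.

```typescript
// Shared-package side (hypothetical): declare the minimal shape the helper needs
// instead of importing a type from a plugin.
interface HasId {
  id: string;
}

function collectIds<T extends HasId>(items: readonly T[]): string[] {
  return items.map((item) => item.id);
}

// Plugin side (hypothetical): the richer type stays in the plugin.
// It is compatible with the helper because it structurally includes `id`.
interface CaseSummary {
  id: string;
  title: string;
  status: 'open' | 'closed';
}

const cases: CaseSummary[] = [
  { id: 'case-1', title: 'Disk alert', status: 'open' },
  { id: 'case-2', title: 'Login failures', status: 'closed' },
];

const caseIds = collectIds(cases);
```

The helper compiles without any reference to the plugin, so moving it into a shared package adds no plugin dependency for other consumers.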