elasticsearch-security-troubleshooting
Elasticsearch Security Troubleshooting
Diagnose and resolve common Elasticsearch security issues. This skill provides a structured triage workflow for authentication failures, authorization errors, TLS problems, API key issues, role mapping mismatches, Kibana login failures, and license-expiry lockouts.
For authentication methods and API key management, see the elasticsearch-authn skill. For roles, users, and role mappings, see the elasticsearch-authz skill. For license management, see the elasticsearch-license skill.
For diagnostic API endpoints, see references/api-reference.md.
Deployment note: Diagnostic API availability differs between self-managed, ECH, and Serverless. See Deployment Compatibility for details.
Jobs to Be Done
- Diagnose HTTP 401 authentication failures
- Diagnose HTTP 403 permission denied errors
- Troubleshoot TLS/SSL handshake or certificate errors
- Investigate expired or invalid API keys
- Debug role mappings that do not grant expected roles
- Fix Kibana login failures, redirect loops, or CORS errors
- Recover from a license-expiry lockout
- Determine why a user lacks access to a specific index
Prerequisites
| Item | Description |
|---|---|
| Elasticsearch URL | Cluster endpoint (e.g. https://localhost:9200 or a Cloud deployment URL) |
| Authentication | Any valid credentials — even minimal — to reach the cluster |
| Cluster privileges | monitor for read-only diagnostics; manage_security for fixes |
Prompt the user for any missing values. If the user cannot authenticate at all, start with TLS and Certificate Errors or License Expiry Recovery.
Diagnostic Workflow
Route the symptom to the correct section:
| Symptom | Section |
|---|---|
| HTTP 401, authentication_exception | Authentication Failures |
| HTTP 403, security_exception, access denied | Authorization Failures |
| SSL/TLS handshake error, certificate rejected | TLS and Certificate Errors |
| API key rejected, expired, or ineffective | API Key Issues |
| Role mapping not granting expected roles | Role Mapping Issues |
| Kibana login broken, redirect loop, CORS error | Kibana Authentication Issues |
| All users locked out, paid features disabled | License Expiry Recovery |
Each section follows a Gather - Diagnose - Resolve pattern.
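The routing table above can be sketched as a small shell function. This is an illustrative helper only: the name route_symptom is hypothetical, the patterns are a rough reading of the table, and the first matching pattern wins, so an ambiguous symptom may need manual routing.

```shell
#!/bin/sh
# Hypothetical helper: map an observed symptom (status code or error text)
# to the section of this document that covers it. First match wins.
route_symptom() {
  case "$1" in
    *401*|*authentication_exception*)            echo "Authentication Failures" ;;
    *403*|*security_exception*|*"access denied"*) echo "Authorization Failures" ;;
    *SSL*|*TLS*|*certificate*)                   echo "TLS and Certificate Errors" ;;
    *"API key"*)                                 echo "API Key Issues" ;;
    *"role mapping"*)                            echo "Role Mapping Issues" ;;
    *Kibana*|*"redirect loop"*|*CORS*)           echo "Kibana Authentication Issues" ;;
    *license*|*"locked out"*)                    echo "License Expiry Recovery" ;;
    *)                                           echo "Diagnostic Toolkit" ;;
  esac
}

# Example call:
#   route_symptom "HTTP 403 security_exception"
```

Anything the patterns do not recognize falls through to the Diagnostic Toolkit, which is the generic starting point.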
Diagnostic Toolkit
Use these APIs at the start of any security investigation:
curl <auth_flags> "${ELASTICSEARCH_URL}/_security/_authenticate"
Confirms identity, realm, and roles. If this fails with 401, the problem is authentication.
curl <auth_flags> "${ELASTICSEARCH_URL}/_xpack"
Confirms whether security is enabled (features.security.enabled). If security is disabled, all security APIs return
errors.
curl -X POST "${ELASTICSEARCH_URL}/_security/user/_has_privileges" \
<auth_flags> \
-H "Content-Type: application/json" \
-d '{
"index": [
{ "names": ["'"${INDEX_PATTERN}"'"], "privileges": ["read"] }
]
}'
Tests whether the authenticated user holds specific privileges without requiring manage_security.
curl <auth_flags> "${ELASTICSEARCH_URL}/_license"
Check license type and status. An expired paid license disables paid realms and features.
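The <auth_flags> placeholder in the commands above stands for whatever credentials the caller has. One possible way to fill it is a thin wrapper around curl, sketched below. The function name es_curl and the environment variables ES_API_KEY, ES_USERNAME, and ES_PASSWORD are assumptions made for this example, not Elasticsearch settings. ES_API_KEY is assumed to hold the base64 "encoded" value returned when the key was created.

```shell
#!/bin/sh
# Hypothetical wrapper: call curl with API key or basic auth, whichever is set.
# ES_API_KEY, ES_USERNAME and ES_PASSWORD are assumed variable names.
es_curl() {
  if [ -n "${ES_API_KEY:-}" ]; then
    # API keys are sent in the Authorization header with the ApiKey scheme.
    curl -H "Authorization: ApiKey ${ES_API_KEY}" "$@"
  elif [ -n "${ES_USERNAME:-}" ]; then
    # Fall back to HTTP basic authentication.
    curl -u "${ES_USERNAME}:${ES_PASSWORD:-}" "$@"
  else
    echo "es_curl: set ES_API_KEY or ES_USERNAME/ES_PASSWORD" >&2
    return 1
  fi
}

# Example call:
#   es_curl "${ELASTICSEARCH_URL}/_security/_authenticate"
```

Preferring the API key when both are set is a design choice for this sketch; swap the branches if basic auth should take priority.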
Authentication Failures (401)
A 401 response means Elasticsearch could not verify the caller's identity.
Gather
curl -v <auth_flags> "${ELASTICSEARCH_URL}/_security/_authenticate" 2>&1
The -v flag shows headers and the response body. Look for:
- WWW-Authenticate header: indicates which auth schemes the cluster accepts.
- authentication_exception in the response body: the reason field describes what failed.
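To pull just the accepted schemes out of the response headers, a filter along the following lines may help. It is a minimal sketch: accepted_schemes is a hypothetical name, and it assumes the headers arrive on standard input in the usual "Name: value" form, for example from curl -sS -D - -o /dev/null.

```shell
#!/bin/sh
# Hypothetical filter: read raw HTTP response headers on stdin and print the
# scheme named in each WWW-Authenticate header, one per line.
accepted_schemes() {
  # Header names are case-insensitive; HTTP header lines end in CR LF,
  # so drop the CR before matching.
  tr -d '\r' | grep -i '^www-authenticate:' | awk '{ print $2 }'
}

# Example call:
#   curl -sS -D - -o /dev/null "${ELASTICSEARCH_URL}/" | accepted_schemes
```

Empty output suggests the header is absent, which matches the case where security may be disabled.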
Diagnose
| Symptom | Likely cause |
|---|---|
| unable to authenticate user | Wrong username or password |
| unable to authenticate with provided credentials | Credentials do not match any realm in the chain |
| user is not enabled | The native user account is disabled |
| token is expired | API key or bearer token has expired |
| No WWW-Authenticate header | Security may be disabled; check GET /_xpack |
If the user authenticates via an external realm (LDAP, AD, SAML, OIDC), the realm chain order matters. Elasticsearch tries realms in configured order and stops at the first realm that authenticates the request. If a higher-priority realm authenticates the same username first, the user receives that realm's roles rather than the intended realm's. If no realm in the chain accepts the credentials, authentication fails.
Resolve
| Cause | Action |
|---|---|
| Wrong credentials | Verify username/password or API key value. See elasticsearch-authn. |
| Disabled user | PUT /_security/user/{name}/_enable. See elasticsearch-authz. |
| Expired API key | Create a new API key. See API Key Issues. |
| Realm chain order | Check elasticsearch.yml realm order (self-managed only). |
| Security disabled | Enable xpack.security.enabled: true in elasticsearch.yml and restart. |
| Paid realm after expiry | License expired — see License Expiry Recovery. |
Authorization Failures (403)
A 403 response means the user is authenticated but lacks the required privileges.
Gather
Test the specific privileges the operation requires:
curl -X POST "${ELASTICSEARCH_URL}/_security/user/_has_privileges" \
<auth_flags> \
-H "Content-Type: application/json" \
-d '{
"index": [
{ "names": ["logs-*"], "privileges": ["read", "view_index_metadata"] }
],
"cluster": ["monitor"]
}'
The response contains a has_all_requested boolean and per-resource breakdowns.
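A quick way to act on that response in a script is sketched below. Both function names are hypothetical, and both use plain text matching on the JSON, which is only a rough check; a JSON-aware tool would be more robust. The sample body in the comment is illustrative.

```shell
#!/bin/sh
# Hypothetical helpers that read a _has_privileges response body on stdin.

# Succeeds (exit 0) only when has_all_requested is true.
has_all_requested() {
  grep -q '"has_all_requested"[[:space:]]*:[[:space:]]*true'
}

# Prints the name of every privilege reported as false, one per line.
# The index name is not printed, only the privilege name.
denied_privileges() {
  grep -o '"[A-Za-z_:/*-]*"[[:space:]]*:[[:space:]]*false' \
    | sed 's/"\([^"]*\)".*/\1/' \
    | grep -v '^has_all_requested$'
}

# Illustrative body:
#   {"username":"joe","has_all_requested":false,"cluster":{"monitor":true},
#    "index":{"logs-*":{"read":false,"view_index_metadata":true}},"application":{}}
```

For the illustrative body, has_all_requested fails and denied_privileges reports the read privilege.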
Also check the user's effective roles:
curl <auth_flags> "${ELASTICSEARCH_URL}/_security/_authenticate"
Inspect the roles array and authentication_realm to confirm the user is who you expect.
Diagnose
| Symptom | Likely cause |
|---|---|
| has_all_requested: false for an index | Role is missing the required index privilege |
| has_all_requested: false for a cluster | Role is missing the required cluster privilege |
| User has fewer roles than expected | Roles array was replaced (not merged) on last update |
| API key returns 403 on previously allowed operation | API key privileges are a snapshot; role changes after creation do not propagate to existing keys |
Resolve
| Cause | Action |
|---|---|
| Missing index privilege | Add the privilege to the role or create a new role. See elasticsearch-authz. |
| Missing cluster privilege | Add the cluster privilege. See elasticsearch-authz. |
| Roles replaced on update | Fetch current roles first, then update with the full array. See elasticsearch-authz. |
| Stale API key privileges | Create a new API key with updated role_descriptors. See elasticsearch-authn. |
TLS and Certificate Errors
TLS errors prevent the client from establishing a connection at all.
Gather
curl -v --cacert "${CA_CERT}" "https://${ELASTICSEARCH_HOST}:9200/" 2>&1 | head -30
Look for:
- SSL certificate problem: unable to get local issuer certificate (CA not trusted).
- SSL certificate problem: certificate has expired (certificate past its validity date).
- SSL: no alternative certificate subject name matches target host name (hostname mismatch).
For deeper inspection (self-managed only):
openssl s_client -connect "${ELASTICSEARCH_HOST}:9200" -showcerts </dev/null 2>&1
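This prints the full certificate chain in PEM form along with each certificate's subject and issuer. To read expiry dates and subject alternative names, pipe a certificate from the output into openssl x509 -noout -text.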
Diagnose
| Error message | Likely cause |
|---|---|
| unable to get local issuer certificate | Missing or wrong CA certificate |
| certificate has expired | Server or CA certificate past expiry |
| no alternative certificate subject name matches | Certificate SAN does not include the hostname |
| self-signed certificate | Self-signed cert not in the trust store |
| SSLHandshakeException (Java client) | Truststore missing the CA or wrong password |
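When curl runs inside a script, its exit code gives a first hint before the error text is read. The sketch below maps a few common curl exit codes to a starting point. The function name explain_curl_exit is hypothetical and the list is deliberately short; older curl builds may report a hostname mismatch with a different code than 60.

```shell
#!/bin/sh
# Hypothetical helper: translate a curl exit code into a first troubleshooting hint.
explain_curl_exit() {
  case "$1" in
    0)  echo "transfer completed: any remaining problem is at the HTTP level" ;;
    6)  echo "could not resolve host: check the endpoint URL" ;;
    7)  echo "could not connect: check host, port, and firewall" ;;
    35) echo "TLS handshake failed: check protocol versions and that the port speaks HTTPS" ;;
    60) echo "certificate verification failed: check the CA, expiry, and hostname" ;;
    *)  echo "other curl error: rerun with -v for details" ;;
  esac
}

# Example call:
#   curl -sS --cacert "${CA_CERT}" "https://${ELASTICSEARCH_HOST}:9200/" >/dev/null
#   explain_curl_exit "$?"
```

An exit code of 0 together with an HTTP 401 or 403 means TLS is fine and the problem belongs to the authentication or authorization sections.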
Resolve
| Cause | Action |
|---|---|
| Wrong CA cert | Pass the correct CA with --cacert or add it to the system trust store. |
| Expired certificate | Regenerate certificates with elasticsearch-certutil (self-managed). |
| Hostname mismatch | Regenerate the certificate with the correct SAN entries. |
| Self-signed cert | Distribute the CA cert to all clients or use a publicly trusted CA. |
| Quick workaround | Use curl -k / --insecure to skip verification. Not for production. |
On ECH, TLS is managed by Elastic — certificate errors usually indicate the client is not using the correct Cloud endpoint URL. On Serverless, TLS is fully managed and transparent.
API Key Issues
Gather
Retrieve the key's metadata:
curl "${ELASTICSEARCH_URL}/_security/api_key?name=${KEY_NAME}" <auth_flags>
Check expiration, invalidated, and role_descriptors in the response.
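The expiration field is a Unix timestamp in milliseconds, so it can be compared against the current time directly. A minimal sketch follows. The function name api_key_expired is hypothetical, and the optional second argument exists only to make the check repeatable. A key created without an expiry has no expiration field, so an empty argument is treated as "not expired".

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if an API key's expiration has passed.
#   $1  expiration in epoch milliseconds (empty means the key never expires)
#   $2  optional "now" in epoch milliseconds; defaults to the current time
api_key_expired() {
  expiration_ms="$1"
  now_ms="${2:-}"
  # No expiration recorded: the key does not expire on its own.
  [ -n "$expiration_ms" ] || return 1
  if [ -z "$now_ms" ]; then
    now_ms=$(( $(date +%s) * 1000 ))
  fi
  [ "$now_ms" -ge "$expiration_ms" ]
}

# Example call:
#   api_key_expired 1709251200000 && echo "key has expired"
```

A key can also be rejected while unexpired if invalidated is true, so check both fields.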
Diagnose
| Symptom | Likely cause |
|---|---|
| 401 when using the key | Key expired or invalidated |
| 403 on operations that should be allowed | Key was created with insufficient role_descriptors |
| Derived key has no access | API key created another API key — derived keys have no privilege |
| Key works for some indices but not others | role_descriptors scope is too narrow |
Resolve
| Cause | Action |
|---|---|
| Expired key | Create a new key with appropriate expiration. See elasticsearch-authn. |
| Invalidated key | Create a new key. Invalidated keys cannot be reinstated. |
| Wrong scope | Create a new key with correct role_descriptors. See elasticsearch-authn. |
| Derived key problem | Use POST /_security/api_key/grant with user credentials instead. See elasticsearch-authn. |
Role Mapping Issues
Role mappings grant roles to users from external realms. When they fail silently, users authenticate but get no roles.
Gather
curl <auth_flags> "${ELASTICSEARCH_URL}/_security/_authenticate"
Note the username, authentication_realm.name, and roles array.
curl <auth_flags> "${ELASTICSEARCH_URL}/_security/role_mapping"
List all mappings and inspect their rules and enabled fields.
Diagnose
| Symptom | Likely cause |
|---|---|
| User has empty roles array | No mapping matches the user's attributes |
| User gets wrong roles | A different mapping matched first or the rule is too broad |
| Mapping exists but does not apply | enabled is false |
| Mustache template produces wrong role name | Template syntax error or unexpected attribute value |
Compare the user's authentication_realm.name and groups (from _authenticate) against each mapping's rules to
find the mismatch.
Resolve
| Cause | Action |
|---|---|
| No matching rule | Update the mapping rules to match the user's realm and attributes. |
| Mapping disabled | Set "enabled": true on the mapping. |
| Template error | Test the Mustache template with known attribute values. See elasticsearch-authz. |
| Rule too broad | Add all / except conditions to narrow the match. See elasticsearch-authz. |
Kibana Authentication Issues
Missing kbn-xsrf header
All mutating Kibana API requests require the kbn-xsrf header:
curl -X PUT "${KIBANA_URL}/api/security/role/my-role" \
<auth_flags> \
-H "kbn-xsrf: true" \
-H "Content-Type: application/json" \
-d '{ ... }'
Without it, Kibana returns 400 Bad Request with "Request must contain a kbn-xsrf header".
SAML/OIDC redirect loop
Common causes:
- Incorrect xpack.security.authc.realms.saml.*.sp.acs or idp.metadata.path in elasticsearch.yml.
- Clock skew between the IdP and Elasticsearch nodes (SAML assertions have a validity window).
- Kibana server.publicBaseUrl does not match the SAML ACS URL.
Verify the SAML realm configuration:
curl <auth_flags> "${ELASTICSEARCH_URL}/_security/_authenticate"
If this returns a valid user via a non-SAML realm, the SAML realm itself is not being reached. Check realm chain order.
Kibana cannot reach Elasticsearch
Kibana logs Unable to retrieve version information from Elasticsearch nodes. Verify the elasticsearch.hosts setting
in kibana.yml points to a reachable endpoint and the credentials (elasticsearch.username / elasticsearch.password
or elasticsearch.serviceAccountToken) are valid.
License Expiry Recovery
When a paid license expires, the cluster enters a security-closed state: paid realms (SAML, LDAP, AD, PKI) stop working and users authenticating through them are locked out. Native and file realms remain functional.
Quick triage
curl <auth_flags> "${ELASTICSEARCH_URL}/_license"
If license.status is "expired", proceed with recovery.
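In a script, the status can be extracted from the response and tested. The sketch below uses plain text matching. The function name license_status is hypothetical, and it assumes the body contains a single "status" field, as in the trimmed sample shown in the comment. A JSON-aware tool would be more robust.

```shell
#!/bin/sh
# Hypothetical helper: read a GET /_license response body on stdin and print
# the value of the "status" field.
license_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Trimmed, illustrative body:
#   {"license":{"status":"expired","type":"platinum"}}
#
# Example call:
#   [ "$(es_response | license_status)" = "expired" ] && echo "start recovery"
# where es_response stands for whichever curl command fetched the body.
```

The same function works on pretty-printed responses because it matches one line at a time.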
Recovery steps
Follow the detailed recovery workflow in the elasticsearch-license skill. The critical first step depends on deployment type:
| Deployment | First step |
|---|---|
| Self-managed | Log in with a file-based user (elasticsearch-users CLI) or native user. |
| ECH | Contact Elastic support or renew via the Cloud console. |
| Serverless | Not applicable — licensing is fully managed by Elastic. |
Examples
User gets 403 when querying logs
Symptom: "I get a 403 when searching logs-*."
- Verify identity:
curl -u "joe:${PASSWORD}" "${ELASTICSEARCH_URL}/_security/_authenticate"
Response shows "roles": ["viewer"].
- Test privileges:
curl -X POST "${ELASTICSEARCH_URL}/_security/user/_has_privileges" \
-u "joe:${PASSWORD}" \
-H "Content-Type: application/json" \
-d '{"index": [{"names": ["logs-*"], "privileges": ["read"]}]}'
Response: "has_all_requested": false — the viewer role does not include read on logs-*.
- Fix: create a logs-reader role and assign it to Joe. See elasticsearch-authz.
API key stopped working
Symptom: "My API key returns 401 since yesterday."
- Check the key:
curl -u "admin:${PASSWORD}" "${ELASTICSEARCH_URL}/_security/api_key?name=my-key"
Response shows "expiration": 1709251200000 — the key expired.
- Fix: create a new API key with a suitable expiration. See elasticsearch-authn.
SAML login redirects to error
Symptom: "Clicking the SSO button in Kibana redirects to an error page."
- Check if the SAML realm is reachable by authenticating with a non-SAML method:
curl -u "elastic:${PASSWORD}" "${ELASTICSEARCH_URL}/_security/_authenticate"
- Verify the IdP metadata URL is accessible from the Elasticsearch nodes (self-managed):
curl -s "${IDP_METADATA_URL}" | head -5
- Check for clock skew — SAML assertions are time-sensitive. Ensure NTP is configured on all nodes.
- Verify server.publicBaseUrl in kibana.yml matches the SAML ACS URL configured in the IdP.
Users locked out after license expired
Symptom: "Nobody can log in to Kibana. We use SAML."
- Check license:
curl -u "admin:${PASSWORD}" "${ELASTICSEARCH_URL}/_license"
Response shows "status": "expired", "type": "platinum".
- The SAML realm is disabled because the paid license expired. Follow the recovery steps in elasticsearch-license: log in with a file-based or native user, then upload a renewed license or revert to basic.
Guidelines
Always start with _authenticate
Run GET /_security/_authenticate as the first diagnostic step. It reveals the user's identity, realm, roles, and
authentication type in a single call. Most issues become apparent from this response alone.
Check the license early
Before investigating realm or privilege issues, verify the license is active with GET /_license. An expired paid
license disables realms and features, producing symptoms that mimic misconfiguration.
Use _has_privileges before manual inspection
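Instead of reading role definitions and mentally computing effective access, use POST /_security/user/_has_privileges
to test specific privileges directly. This is faster and accounts for role composition. It reports privileges only, so review document-level and field-level security restrictions in the role definitions themselves.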
Avoid superuser credentials
Never use the built-in elastic superuser for day-to-day troubleshooting. Create a dedicated admin user or API key with
manage_security privileges. Reserve the elastic user for initial setup and emergency recovery only.
Do not bypass TLS in production
Using curl -k or --insecure skips certificate verification and masks real TLS issues. Use it only for initial
diagnosis, then fix the underlying certificate problem.
Deployment Compatibility
Diagnostic tool and API availability differs across deployment types.
| Tool / API | Self-managed | ECH | Serverless |
|---|---|---|---|
| _security/_authenticate | Yes | Yes | Yes |
| _security/user/_has_privileges | Yes | Yes | Yes |
| _xpack | Yes | Yes | Limited |
| _license | Yes | Yes (read) | Not available |
| _security/api_key (GET) | Yes | Yes | Yes |
| _security/role_mapping | Yes | Yes | Yes |
| elasticsearch-users CLI | Yes | Not available | Not available |
| openssl s_client on nodes | Yes | Not available | Not available |
| Elasticsearch logs | Yes | Via Cloud UI | Via Cloud UI |
ECH notes:
- No node-level access, so the elasticsearch-users CLI and direct log/certificate inspection are not available.
- TLS is managed by Elastic; certificate errors typically indicate an incorrect endpoint URL.
- Use the Cloud console for log inspection and deployment configuration.
Serverless notes:
- Licensing APIs are not exposed. License-related lockouts do not occur.
- Native users do not exist — authentication issues are handled at the organization level.
- TLS is fully managed and transparent.