# Elasticsearch Search

`wcs.backend` builds its site search on top of
[collective.elasticsearch](https://github.com/collective/collective.elasticsearch)
and exposes three REST API endpoints aimed at frontend consumption:

- `@es-search` -- the main search endpoint. Behaves like the standard Plone
  `@search`, but additionally returns faceting (aggregations) and search
  suggestions.
- `@raw-search` -- a passthrough for raw Elasticsearch query bodies, with the
  site's security layer applied automatically.
- `@popular-searches` -- a small, configurable list of suggested search terms
  for empty-state UIs.

The search query logic, faceting, suggestions and filters are configured
through the Plone registry (see [Configuration](#configuration)). Frontends
generally only need to send query parameters and render the response.


## `@es-search`

Run a search and receive standard Plone result items plus an `elasticsearch`
block with aggregations and suggestions.

```http
GET /Plone/@es-search?SearchableText=content&portal_type=Document HTTP/1.1
Host: localhost:8080
Accept: application/json
```

Response:

```{literalinclude} ./http-examples/es-search.resp
:language: http
```

### Query parameters

`@es-search` accepts the same query parameters as the standard Plone
`@search` endpoint. The most relevant ones for a frontend:

`SearchableText`
The full-text search term. This is the value that drives relevance scoring and
the suggestion block.

`portal_type`
Restrict results to one or more content types. Repeat the parameter to pass
multiple values (`portal_type=Document&portal_type=News%20Item`).

`path`
Constrain the search to a sub-tree of the site.

`b_start` / `b_size`
Batching / paging, as with the standard `@search` endpoint.

`sort_on` / `sort_order`
Sorting. Without an explicit sort, results come back ordered by Elasticsearch
relevance score.

`fullobjects`
When present, each result item is returned as a full object serialization
instead of the default summary. When highlighting is enabled, highlight
snippets are surfaced through the `description` field in both the summary and
the full-object serialization.

`use_site_search_settings`
When present, the configured site search query template, global filter and API
filter are applied to the query. Use this for the main site search box so that
the registry-configured relevance logic and filters take effect.

Any registered catalog index can additionally be passed as a query parameter
to filter on it (for example a custom keyword index used as a facet).
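
The parameters above can be assembled with the standard library. The following
is a minimal sketch, not part of `wcs.backend`: the helper name
`build_es_search_url`, its defaults and the `'1'` value for
`use_site_search_settings` are assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_es_search_url(site_url, term, portal_types=(), b_start=0, b_size=25,
                        use_site_settings=True):
    """Build an @es-search URL; list values become repeated parameters."""
    params = {
        "SearchableText": term,
        "portal_type": list(portal_types),
        "b_start": b_start,
        "b_size": b_size,
    }
    if use_site_settings:
        # The parameter only needs to be present; "1" is an arbitrary value.
        params["use_site_search_settings"] = "1"
    # doseq=True expands the portal_type list into repeated key=value pairs.
    return f"{site_url.rstrip('/')}/@es-search?{urlencode(params, doseq=True)}"


url = build_es_search_url(
    "http://localhost:8080/Plone", "content", ["Document", "News Item"]
)
print(url)
```

`urlencode` encodes the space in `News Item` as `+`, which is equivalent to
`%20` inside a query string.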

### Quoted vs. unquoted searches

The `SearchableText` term is matched using a configurable Elasticsearch query
template. Two templates exist:

- **Regular** (`wcs.backend.search.regular`) -- used for normal terms. Matches
  are OR-combined across `Title`, `Description` and `SearchableText`, with
  `Title` boosted highest.
- **Quoted** (`wcs.backend.search.quoted`) -- used when the term is wrapped in
  double quotes (`SearchableText="exact phrase"`). All words must match
  (`minimum_should_match: 100%`), producing a stricter, phrase-style result.

The frontend does not need to choose a template -- simply pass the user's input
verbatim (including any surrounding quotes) and the backend selects the
appropriate template.
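
To make the selection rule concrete, here is a hedged sketch of how a term
might be classified. The backend's actual detection logic is not shown in this
document, so `template_key` and its "whole term wrapped in double quotes" rule
are assumptions. The last line shows the percent-encoding a quoted term needs
when it is placed in a URL by hand.

```python
from urllib.parse import quote

REGULAR = "wcs.backend.search.regular"
QUOTED = "wcs.backend.search.quoted"


def template_key(term):
    """Return the registry key of the template a term would plausibly select."""
    stripped = term.strip()
    # Assumption: "quoted" means the whole term is wrapped in double quotes.
    is_quoted = (
        len(stripped) >= 2
        and stripped.startswith('"')
        and stripped.endswith('"')
    )
    return QUOTED if is_quoted else REGULAR


print(template_key('"exact phrase"'))
print(template_key("exact phrase"))
# Quotes and spaces must be percent-encoded when the term goes into a URL.
print(quote('"exact phrase"'))
```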

### The `elasticsearch` response block

In addition to the standard `items` / `items_total` keys, the response carries
an `elasticsearch` object:

`aggregations`
The facet buckets computed for the **current** result set (i.e. after the
active facet filters were applied). Each bucket has a `key`, a `doc_count`, and
a human-readable `title` resolved from the content type, the matching
vocabulary, or the topic title where applicable.

`original_aggregations`
The facet buckets computed **without** the active facet filters applied. This
lets the UI keep showing the full set of selectable facet values even after the
user narrows the result set. It is only populated when a facet filter is
actually active.

`suggest`
"Did you mean" style term suggestions derived from the search term, keyed by
the configured suggesters.
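
As an illustration of consuming `suggest`, the sketch below assumes the block
keeps the standard Elasticsearch suggester layout (one list of entries per
suggester, each entry carrying an `options` list with `text` and `score`). The
helper `top_suggestions` and the suggester name `title_suggest` are
hypothetical.

```python
def top_suggestions(suggest, limit=3):
    """Collect distinct suggestion texts across all suggesters, best score first."""
    scored = {}
    for entries in suggest.values():
        for entry in entries:
            for option in entry.get("options", []):
                text = option["text"]
                score = option.get("score", 0.0)
                # Keep the best score seen for each distinct text.
                scored[text] = max(score, scored.get(text, 0.0))
    ranked = sorted(scored, key=lambda text: scored[text], reverse=True)
    return ranked[:limit]


# Hypothetical payload in the standard Elasticsearch suggester layout.
sample_suggest = {
    "title_suggest": [
        {
            "text": "contnet",
            "offset": 0,
            "length": 7,
            "options": [
                {"text": "content", "score": 0.85, "freq": 12},
                {"text": "contest", "score": 0.71, "freq": 2},
            ],
        },
    ],
}
print(top_suggestions(sample_suggest))
```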

In the example response, `aggregations.portal_type` reflects the filtered
counts while `original_aggregations.portal_type` shows the unfiltered counts
(including a `Folder` bucket that the active filter removed). Render the list
of facet values from `original_aggregations` when it is populated (falling back
to `aggregations` when no facet filter is active), and take the counts and
selected state from `aggregations`.
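
The combination of the two blocks can be sketched as follows. This is an
illustrative helper, not backend code: `facet_options` and the sample payload
are assumptions, and only the `key`, `doc_count` and `title` bucket fields
described above are relied on.

```python
def facet_options(es_block, facet):
    """Merge unfiltered and filtered buckets into a list of UI facet options."""
    current = (es_block.get("aggregations") or {}).get(facet, {}).get("buckets", [])
    original = (
        (es_block.get("original_aggregations") or {})
        .get(facet, {})
        .get("buckets", [])
    )
    # original_aggregations is only populated while a facet filter is active,
    # so fall back to the current buckets otherwise.
    base = original or current
    current_counts = {bucket["key"]: bucket["doc_count"] for bucket in current}
    return [
        {
            "key": bucket["key"],
            "title": bucket.get("title", bucket["key"]),
            # Values removed by the active filter get a count of 0.
            "count": current_counts.get(bucket["key"], 0),
        }
        for bucket in base
    ]


# Hypothetical elasticsearch block with an active portal_type filter.
sample_block = {
    "aggregations": {
        "portal_type": {"buckets": [{"key": "Document", "doc_count": 3, "title": "Page"}]}
    },
    "original_aggregations": {
        "portal_type": {
            "buckets": [
                {"key": "Document", "doc_count": 3, "title": "Page"},
                {"key": "Folder", "doc_count": 2, "title": "Folder"},
            ]
        }
    },
}
for option in facet_options(sample_block, "portal_type"):
    print(option["title"], option["count"])
```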

### Consuming `@es-search`

**JavaScript:**

```javascript
async function search(siteUrl, term, type) {
  const params = new URLSearchParams({
    SearchableText: term,
    use_site_search_settings: '1',
  });
  if (type) params.append('portal_type', type);

  const response = await fetch(`${siteUrl}/@es-search?${params}`, {
    headers: { Accept: 'application/json' },
  });
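  if (!response.ok) throw new Error(`Search failed: ${response.status}`);
  const data = await response.json();

  // original_aggregations is only populated while a facet filter is active,
  // so fall back to the current aggregations otherwise.
  const es = data.elasticsearch ?? {};
  const facets =
    es.original_aggregations?.portal_type?.buckets ??
    es.aggregations?.portal_type?.buckets ??
    [];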
  return {
    items: data.items,
    total: data.items_total,
    facets,
    suggestions: data.elasticsearch.suggest,
  };
}

const { items, facets } = await search(
  'http://localhost:8080/Plone',
  'content',
  'Document',
);
```

**Python:**

```python
import requests

response = requests.get(
    'http://localhost:8080/Plone/@es-search',
    params={
        'SearchableText': 'content',
        'portal_type': 'Document',
        'use_site_search_settings': '1',
    },
    headers={'Accept': 'application/json'},
)
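response.raise_for_status()  # fail loudly on HTTP errors
data = response.json()

# Standard Plone result items
for item in data['items']:
    print(item['title'], item['@id'])

# Facet buckets for the current (filtered) result set
for bucket in data['elasticsearch']['aggregations']['portal_type']['buckets']:
    print(bucket['title'], bucket['doc_count'])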
```


## `@raw-search`

Send a raw Elasticsearch query body and get the raw Elasticsearch response
back. This is intended for advanced search UIs that need aggregations, custom
sorting or scoring beyond what `@es-search` exposes.

The endpoint always applies the site's security layer to the submitted query:
the current user's `allowedRolesAndUsers` is added as a mandatory clause, and
inactive content (expired / not yet effective) is filtered out unless the user
holds the corresponding permission. A query body **must** be supplied; a request
without a `query` is rejected.

```http
POST /Plone/@raw-search HTTP/1.1
Host: localhost:8080
Accept: application/json
Content-Type: application/json

{
    "query": {
        "match": {
            "SearchableText": "important content"
        }
    },
    "aggs": {
        "types": {
            "terms": {
                "field": "portal_type"
            }
        }
    },
    "size": 20,
    "sort": ["_score", "sortable_title"]
}
```

Response:

```{literalinclude} ./http-examples/raw-es-search.resp
:language: http
```

### Request body

`query` (required)
A standard Elasticsearch query object. If it is not already a `bool` query, it
is automatically wrapped in `bool.must` so the security clauses can be appended.
Supplying a `bool` query directly lets you control `must` / `should` /
`filter` / `must_not` yourself.

`aggs` (optional)
A standard Elasticsearch aggregations object. The resulting buckets are
returned verbatim under the top-level `aggregations` key of the response.

`size` (optional, default `10`)
Maximum number of hits to return.

`from_` (optional, default `0`)
Offset into the result set, for paging.

`sort` (optional, default `["_score"]`)
A standard Elasticsearch sort specification.

`stored_fields` (optional, default `"*"`)
Which stored fields to return on each hit. Ignored when `_source` is supplied.

`_source` (optional)
A standard Elasticsearch `_source` selector. When present, hits return the
matching `_source` document (as shown in the example response) instead of
stored fields.
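
The wrapping described for `query` can be pictured with the sketch below. It is
not the backend's implementation: `apply_security` is a hypothetical helper,
placing the clause under `filter` is an assumption (the source only says it is
a mandatory clause), and the inactive-content filter is left out.

```python
import copy


def apply_security(query, allowed_roles_and_users):
    """Return a bool query with an allowedRolesAndUsers clause appended."""
    if isinstance(query, dict) and set(query) == {"bool"}:
        # Already a bool query: keep the caller's must/should/filter/must_not.
        secured = copy.deepcopy(query)
    else:
        # Non-bool queries are wrapped in bool.must.
        secured = {"bool": {"must": [copy.deepcopy(query)]}}
    filters = secured["bool"].setdefault("filter", [])
    if isinstance(filters, dict):
        # Elasticsearch accepts a single clause as an object; normalise to a list.
        filters = secured["bool"]["filter"] = [filters]
    filters.append(
        {"terms": {"allowedRolesAndUsers": list(allowed_roles_and_users)}}
    )
    return secured


secured = apply_security(
    {"match": {"SearchableText": "important content"}},
    ["Anonymous"],
)
print(secured)
```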

### Response shape

The response is the raw Elasticsearch payload:

- `hits.total.value` -- total number of matching documents.
- `hits.hits[]` -- the individual hits, each with `_score` and either
  `_source` or stored fields.
- `aggregations` -- present only when `aggs` was supplied in the request.

Because the response is the native Elasticsearch shape (not Plone search
items), the frontend is responsible for turning hit `_source.path` values into
usable URLs.
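
One way to do that mapping is sketched below. It assumes `_source.path` is a
string holding the physical path, prefixed with the portal's own path
(`/Plone` in the examples here). Both the helper `hit_url` and that path layout
are assumptions to verify against a real response.

```python
def hit_url(site_url, hit, portal_path="/Plone"):
    """Map a hit's physical path to a public URL, or None if no path is present."""
    path = hit.get("_source", {}).get("path")
    if not isinstance(path, str):
        # No _source (stored fields only) or an unexpected path shape.
        return None
    # Drop the portal's physical path prefix so it is not duplicated in the URL.
    if path == portal_path or path.startswith(portal_path + "/"):
        path = path[len(portal_path):]
    return site_url.rstrip("/") + path


print(hit_url("https://example.org", {"_source": {"path": "/Plone/news/item"}}))
```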

### Consuming `@raw-search`

**JavaScript:**

```javascript
async function rawSearch(siteUrl, term) {
  const response = await fetch(`${siteUrl}/@raw-search`, {
    method: 'POST',
    headers: {
      Accept: 'application/json',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      query: { match: { SearchableText: term } },
      aggs: { types: { terms: { field: 'portal_type' } } },
      size: 20,
      sort: ['_score', 'sortable_title'],
    }),
  });
  const data = await response.json();
  return {
    total: data.hits.total.value,
    hits: data.hits.hits,
    types: data.aggregations?.types?.buckets ?? [],
  };
}
```

**Python:**

```python
import requests

response = requests.post(
    'http://localhost:8080/Plone/@raw-search',
    json={
        'query': {'match': {'SearchableText': 'important content'}},
        'aggs': {'types': {'terms': {'field': 'portal_type'}}},
        'size': 20,
        'sort': ['_score', 'sortable_title'],
        # Request _source explicitly; without it hits carry stored fields.
        '_source': ['path'],
    },
    headers={'Accept': 'application/json'},
)
response.raise_for_status()  # fail loudly on HTTP errors
data = response.json()
print(data['hits']['total']['value'])
for hit in data['hits']['hits']:
    print(hit['_score'], hit.get('_source', {}).get('path'))
```


## `@popular-searches`

Return a configured list of popular / suggested search terms. The endpoint is
registered on the site root only. Each item is a ready-to-use `@es-search` link
for the term, so the frontend can render the list directly as clickable
suggestions.

```http
GET /Plone/@popular-searches HTTP/1.1
Host: localhost:8080
Accept: application/json
```

Response:

```{literalinclude} ./http-examples/popular-searches.resp
:language: http
```

Each entry exposes:

- `title` -- the search term to display.
- `@id` -- an `@es-search` URL pre-filled with that term as `SearchableText`.

When no popular searches are configured, `items` is an empty array.

### Consuming `@popular-searches`

**JavaScript:**

```javascript
async function popularSearches(siteUrl) {
  const response = await fetch(`${siteUrl}/@popular-searches`, {
    headers: { Accept: 'application/json' },
  });
  const data = await response.json();
  return data.items; // [{ title, '@id' }, ...]
}
```
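
A Python counterpart, as a minimal sketch: it works on an already decoded
response and recovers the search term from each `@id` link. The helper
`popular_terms` and the sample payload are hypothetical; only the `items`,
`title` and `@id` keys described above are assumed.

```python
from urllib.parse import parse_qs, urlsplit


def popular_terms(payload):
    """Return (title, search_term, url) tuples from an @popular-searches payload."""
    result = []
    for item in payload.get("items", []):
        url = item["@id"]
        query = parse_qs(urlsplit(url).query)
        # Fall back to the title if the link carries no SearchableText.
        term = query.get("SearchableText", [item["title"]])[0]
        result.append((item["title"], term, url))
    return result


# Hypothetical payload with the documented item keys.
sample_payload = {
    "items": [
        {
            "title": "opening hours",
            "@id": "http://localhost:8080/Plone/@es-search?SearchableText=opening+hours",
        },
    ]
}
for title, term, url in popular_terms(sample_payload):
    print(title, "->", url)
```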


## German Text Analysis

A custom Elasticsearch mapping registers a German text analyzer (with German
stopword filtering) and applies it to the `SearchableText`, `Title`,
`Description` and `content_title` fields, improving search quality for German
content. This is transparent to API consumers -- no special parameters are
required.


## Configuration

Search behaviour is driven by Plone registry records. These are configured by
site administrators; their values shape what the search endpoints return.

`wcs.backend.search.regular`
Elasticsearch query template for regular (unquoted) `SearchableText` searches.

`wcs.backend.search.quoted`
Elasticsearch query template for quoted (`"exact phrase"`) searches.

`wcs.backend.search.filter`
Global Elasticsearch filter clauses appended to every site search.

`wcs.backend.search.aggregations`
The aggregations (facets) computed for `@es-search`. Drives both the
`aggregations` and `original_aggregations` response blocks.

`wcs.backend.search.suggest`
The suggester configuration used to build the `suggest` response block.

`wcs.backend.search.popular_searches`
The list of terms returned by `@popular-searches`.

`wcs.backend.search.custom_fields`
Additional Elasticsearch field mappings to register on the index.

`wcs.backend.search.api_domains`
Hostnames considered "API domains". Requests arriving from these hosts get the
`api_filter` applied.

`wcs.backend.search.api_filter`
Elasticsearch filter clauses applied only to requests from the configured
`api_domains`. Useful for hiding content from public search while keeping it
visible on the backend domain.

```{note}
The site query template, global filter and API filter are only applied to
`@es-search` when the request includes `use_site_search_settings`. For the
main site search box, always send that parameter so the configured relevance
logic and filters take effect.
```
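
The interplay of `use_site_search_settings`, the global filter and the API
filter can be summarised in a short sketch. It models the registry as a plain
dictionary of lists, which is an assumption about the stored format, and
`active_filters` is a hypothetical helper rather than backend code.

```python
def active_filters(host, use_site_search_settings, registry):
    """Return the filter clauses a request would plausibly receive."""
    if not use_site_search_settings:
        # Without the parameter, no registry-configured filters apply.
        return []
    clauses = list(registry.get("wcs.backend.search.filter", []))
    if host in registry.get("wcs.backend.search.api_domains", []):
        # The API filter is added only for requests from configured API domains.
        clauses += registry.get("wcs.backend.search.api_filter", [])
    return clauses


# Hypothetical registry values for illustration.
sample_registry = {
    "wcs.backend.search.filter": [{"term": {"review_state": "published"}}],
    "wcs.backend.search.api_domains": ["api.example.org"],
    "wcs.backend.search.api_filter": [{"term": {"exclude_from_search": False}}],
}
print(active_filters("api.example.org", True, sample_registry))
print(active_filters("cms.example.org", True, sample_registry))
```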

```{note}
`filter` clauses referencing `allowedRolesAndUsers` are rejected -- security
filtering is managed by the backend and cannot be overridden through the
registry or a raw query.
```