HED search guide

Many analysis methods locate event markers with specified properties and extract sections of the data surrounding these markers for analysis. This extraction process is called epoching or trial selection.

Analysis may also exclude data surrounding particular event markers.

Other approaches find sections of the data with particular signal characteristics and then determine which types of event markers are more likely to be associated with data sections having these characteristics.

At a more global level, analysts may want to locate datasets whose event markers have certain properties when choosing data for initial analysis or for comparison with their own data.

HED search basics

Datasets whose event markers are annotated with HED (Hierarchical Event Descriptors) can be searched in a dataset-independent manner. The HED search facility is implemented in HEDTools, an open-source Python library. The latest versions are available in the hed-python GitHub repository.

To perform a query using HEDTools, users create a query object containing the parsed query. Once created, this query object can be applied to any number of HED annotations, for example to the annotations for each event marker associated with a data recording.

The query object returns a list of matches within the annotation. Usually, users just test whether this list is empty to determine if the query was satisfied.

Calling syntax

To perform a search, create a QueryParser object, which parses the query. The same object can then be applied to any number of HED annotations. The syntax is demonstrated in the following example:
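A minimal sketch of this calling pattern is given here. It assumes a HEDTools release in which `QueryParser` can be imported from `hed.models.expression_parser`; the class name and module have varied between releases, so check the installed version. The function name `search_annotations` is illustrative and is not part of HEDTools.

```python
def search_annotations(annotations, query_text, schema_version="8.1.0"):
    """Return, for each annotation string, whether it satisfies the query.

    Assumes the annotations have already been validated against the schema.
    """
    # Imported inside the function so the sketch can be loaded without HEDTools.
    # The import path for QueryParser is an assumption about the installed release.
    from hed import HedString, load_schema_version
    from hed.models.expression_parser import QueryParser

    schema = load_schema_version(schema_version)  # may download and cache the schema
    query = QueryParser(query_text)               # parse the query only once

    results = []
    for text in annotations:
        hed_string = HedString(text, schema)      # parsed form of the annotation
        matches = query.search(hed_string)        # list of matching groups
        results.append(bool(matches))             # an empty list means no match
    return results
```

With HEDTools installed, the function could be called as `search_annotations(["Sensory-event, Red", "Agent-trait"], "Sensory-event and Red")` to obtain one boolean per annotation.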

In the example, each string containing a HED annotation is converted to a HedString object, which is a parsed representation of the annotation. The query facility assumes that the annotations have been validated. A HedSchema is required; in the example, standard schema version 8.1.0 is loaded. The schemas are available on GitHub.

The query is represented by a QueryParser object. The search method returns a list of groups in the HED string that match the query. This return list can be quite complex and usually must be filtered before being used directly. In many applications, we are not interested in the exact groups, but just whether the query was satisfied. In the above example, the result is treated as a boolean value.

Warning

  • If you are searching many strings for the same expression, be sure to create the QueryParser only once.

  • The current search facility is case-insensitive.

Single tag queries

The simplest type of query is to search for the presence or absence of a single tag. HED offers four variations on the single tag query, as summarized in the following table.

Entries within a cell are separated by semicolons.

| Query type | Example query | Matches | Does not match |
| --- | --- | --- | --- |
| Single-term: match the term or any child. Values and extensions are not considered when matching. | Agent-trait | Agent-trait; Age; Age/35; Right-handed; Agent-trait/Glasses; Agent-property/Agent-trait; (Age, Blue) | Agent-property |
| Quoted-tag: match the exact tag with extension or value. | "Age" | Age; Agent-trait/Age | Age/35 |
| | "Age/34" | Age/34; Agent-trait/Age/34 | Age/35 |
| Tag-path with slash: match the exact tag with extension or value. | Age/34 | Age/34; Agent-trait/Age/34 | Age; Age/35 |
| Tag-prefix with wildcard: match the starting portion of a tag and possibly its value or extension. | Age/3* | Age/34; Age/3; Agent-trait/Age/34 | Age; Age/40 |

The meanings of the different queries are explained in the following subsections.

Tag-path with slash

If the query includes a slash in the tag path, then the query must match the exact value with the slash. Thus, Age/34 does not match Age or Age/35. The query does match Agent-trait/Age/34, because the short form of this tag path is Age/34. Tag short forms are used for matching to ensure consistency.

Tag-prefix with wildcard

A query consisting of a tag prefix followed by the * wildcard matches any tag that starts with that prefix. Thus, Age/3* matches Age/3 as well as Age/34.

Notice that the query Age* matches a myriad of tags including Agent, Agent-state, and Agent-property.
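The two rules above can be illustrated with a small stand-alone sketch. It is not the HEDTools implementation: the `SCHEMA_NODES` set is a hypothetical stand-in for the schema, and the helper names are invented for illustration. Matching is done on lowercased short forms, in line with the case-insensitive behavior noted earlier.

```python
# Hypothetical subset of schema node names, lowercased. In HEDTools the
# loaded schema supplies this information.
SCHEMA_NODES = {"agent-property", "agent-trait", "age"}


def short_form(tag):
    """Drop the path elements that precede the last schema node in the tag."""
    parts = tag.split("/")
    node_positions = [i for i, part in enumerate(parts) if part.lower() in SCHEMA_NODES]
    start = node_positions[-1] if node_positions else 0
    return "/".join(parts[start:])


def matches_tag_path(query, tag):
    """Tag-path with slash: the short form must equal the query exactly."""
    return short_form(tag).lower() == query.lower()


def matches_tag_prefix(query, tag):
    """Tag-prefix with wildcard: the short form must start with the text before '*'."""
    prefix = query.rstrip("*").lower()
    return short_form(tag).lower().startswith(prefix)


for tag in ["Age/34", "Agent-trait/Age/34", "Age", "Age/35"]:
    print(tag, matches_tag_path("Age/34", tag))
for tag in ["Age/3", "Age/34", "Age/40", "Agent-state"]:
    print(tag, matches_tag_prefix("Age/3*", tag), matches_tag_prefix("Age*", tag))
```

The last loop shows why a short prefix such as Age* is rarely what an analyst wants: it accepts Agent-state just as readily as Age/34.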

Logical queries

In the following, A and B represent HED expressions that may contain multiple comma-separated tags and parenthesized groups. A and B may also contain group queries, as described in the next section. The expressions for A and B are each evaluated and then combined using standard logic.

| Query form | Example query | Matches | Does not match |
| --- | --- | --- | --- |
| A, B: match if both A and B are matched. | Event, Sensory-event | Event, Sensory-event; Sensory-event, Event; (Event, Sensory-event) | Event |
| A and B: match if both A and B are matched. Same as the comma notation. | Event and Sensory-event | Event, Sensory-event; Sensory-event, Event; (Event, Sensory-event) | |
| A or B: match if either A or B is matched. | Event or Sensory-event | Event, Sensory-event; Sensory-event, Event; (Event, Sensory-event); Event; Sensory-event | Agent-trait |
| ~A: match groups that do not contain A. A can be an arbitrary expression. | {Event, ~Action} | (Event); (Event, Animal-agent); (Sensory-event, (Action)) | Event; Event, Action; (Event, Action) |
| @A: match a line that does not contain A. | @Event | Action; Agent-trait; Action, Agent-trait; (Action, Agent) | Event; (Action, Event); (Action, Sensory-event); (Agent, (Sensory-event, Blue)) |
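The rows of the logical-query table can be reproduced with a toy model. This is only an illustrative sketch, not the HEDTools implementation. An annotation is modeled as a Python list whose items are tag strings or tuples (parenthesized groups), A and B are restricted to single terms, and `PARENT` is a hypothetical stand-in for the schema hierarchy. All function names are invented for the example.

```python
# Hypothetical child -> parent links standing in for a small part of the schema.
PARENT = {"sensory-event": "event", "agent-trait": "agent-property"}


def term_matches(term, tag):
    """Single-term match: the tag is the term itself or one of its descendants."""
    node = tag.lower()
    while node is not None:
        if node == term.lower():
            return True
        node = PARENT.get(node)
    return False


def walk_tags(items):
    """Yield every tag in the annotation, at any nesting depth."""
    for item in items:
        if isinstance(item, tuple):
            yield from walk_tags(item)
        else:
            yield item


def walk_groups(items):
    """Yield every parenthesized group (tuple), at any nesting depth."""
    for item in items:
        if isinstance(item, tuple):
            yield item
            yield from walk_groups(item)


def contains(annotation, term):
    return any(term_matches(term, tag) for tag in walk_tags(annotation))


def query_and(annotation, a, b):            # A and B (same as A, B)
    return contains(annotation, a) and contains(annotation, b)


def query_or(annotation, a, b):             # A or B
    return contains(annotation, a) or contains(annotation, b)


def query_line_without(annotation, a):      # @A
    return not contains(annotation, a)


def query_group_without(annotation, a, b):  # {A, ~B}
    """Some group holds A directly and does not hold B directly."""
    for group in walk_groups(annotation):
        direct = [item for item in group if isinstance(item, str)]
        has_a = any(term_matches(a, tag) for tag in direct)
        has_b = any(term_matches(b, tag) for tag in direct)
        if has_a and not has_b:
            return True
    return False


print(query_line_without([("Action", "Sensory-event")], "Event"))
print(query_group_without([("Sensory-event", ("Action",))], "Event", "Action"))
```

In this model, @Event rejects a line containing Sensory-event because a single-term query also matches the children of the term.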

Group queries

Tag grouping with parentheses is an essential part of HED annotation, since HED strings are independent of ordering of tags or tag groups at the same level.

Consider the annotation:

Red, Square, Blue, Triangle

In this form, tools cannot distinguish which color goes with which shape. Annotators must group tags using parentheses to make the meaning clear:

(Red, Square), (Blue, Triangle)

This annotation indicates a red square and a blue triangle. Group queries allow analysts to detect these groupings.

As with logical queries, A and B represent HED expressions that may contain multiple comma-separated tags and parenthesized groups.

| Query form | Example query | Matches | Does not match |
| --- | --- | --- | --- |
| {A, B}: match a group that contains both A and B at the same level in the same group. | {Red, Blue} | (Red, Blue); (Red, Blue, Green) | (Red, (Blue, Green)) |
| [A, B]: match a group that contains A and B. Both A and B could be at any subgroup level. | [Red, Blue] | (Red, (Blue, Green)); ((Red, Yellow), (Blue, Green)) | Red, (Blue, Green) |
| {A, B:}: match a group that contains both A and B at the same level and no other contents. | {Red, Blue:} | (Red, Blue) | (Red, Blue, Green); (Red, Blue, (Green)) |
| {A, B: C}: match a group that contains both A and B at the same level and optionally C. | {Red, Blue: Green} | (Red, Blue); (Red, Blue, Green) | (Red, (Blue, Green)); (Red, Blue, (Green)) |
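The four group forms can be sketched on a toy representation. This is not how HEDTools is implemented, and it is restricted to plain tags rather than arbitrary expressions. An annotation is a list of tag strings and tuples (groups), and the function names are invented for the example.

```python
def walk_groups(items):
    """Yield every parenthesized group (tuple), at any nesting depth."""
    for item in items:
        if isinstance(item, tuple):
            yield item
            yield from walk_groups(item)


def direct_tags(group):
    """Tags that sit directly in the group, ignoring nested subgroups."""
    return {item for item in group if isinstance(item, str)}


def nested_tags(group):
    """Tags anywhere inside the group, including nested subgroups."""
    tags = set()
    for item in group:
        if isinstance(item, tuple):
            tags |= nested_tags(item)
        else:
            tags.add(item)
    return tags


def same_level(annotation, required):                 # {A, B}
    return any(required <= direct_tags(g) for g in walk_groups(annotation))


def any_level(annotation, required):                  # [A, B]
    return any(required <= nested_tags(g) for g in walk_groups(annotation))


def exact(annotation, required, optional=frozenset()):  # {A, B:} and {A, B: C}
    for group in walk_groups(annotation):
        if any(isinstance(item, tuple) for item in group):
            continue  # a nested subgroup counts as extra content
        tags = direct_tags(group)
        if required <= tags <= (required | optional):
            return True
    return False


red_blue = {"Red", "Blue"}
print(same_level([("Red", ("Blue", "Green"))], red_blue))     # tags at different levels
print(any_level([("Red", ("Blue", "Green"))], red_blue))      # nesting is allowed here
print(exact([("Red", "Blue", "Green")], red_blue))            # Green is extra content
print(exact([("Red", "Blue", "Green")], red_blue, {"Green"}))  # Green is now optional
```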

These operations can be arbitrarily nested and combined, as for example in the query:

[A or {B and C}]

The order of the search terms, and of the tags in the strings being searched, does not matter. Outside of grouping operations, precedence is generally left to right.

Wildcard matching is supported, but primarily makes sense in exact matching groups. You can replace any term with a wildcard:

| Query form | Example query | Matches | Does not match |
| --- | --- | --- | --- |
| ?: matches any tag or group. | {A and ?} | (A, B); (A, (B)) | (A); (B, C) |
| ??: matches any tag. | {A and ??} | (A, B) | (A); (B, C); (A, (B)) |
| ???: matches any group. | {A and ???} | (A, (B)) | (A); (B, C); (A, B) |
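A toy check that reproduces the wildcard rows is sketched here. It is illustrative only, not the HEDTools implementation: a group is modeled as a tuple of tag strings and nested tuples, and `matches_with_wildcard` is an invented name.

```python
def matches_with_wildcard(group, tag, wildcard):
    """Toy version of {tag and <wildcard>} applied to a single group."""
    if tag not in group:
        return False
    others = list(group)
    others.remove(tag)  # what remains must satisfy the wildcard
    if wildcard == "?":      # any tag or group
        return len(others) > 0
    if wildcard == "??":     # any tag
        return any(isinstance(item, str) for item in others)
    if wildcard == "???":    # any group
        return any(isinstance(item, tuple) for item in others)
    raise ValueError(f"unknown wildcard: {wildcard}")


for group in [("A", "B"), ("A", ("B",)), ("A",), ("B", "C")]:
    print(group, [matches_with_wildcard(group, "A", w) for w in ("?", "??", "???")])
```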

Notes

  • You cannot use negation inside exact-matching groups, that is, with the {X:} or {X:Y} notation.

  • You cannot use negation in combination with wildcards (?, ??, or ???).

  • In exact group matching, or matches one or the other, not both: {A or B:} matches (A) or (B), but not (A, B).

Where can HED search be used?

The HED search facility allows users to form sophisticated queries based on HED annotations in a dataset-independent manner. These queries can be used to locate datasets satisfying the specified criteria and to find the relevant event markers in those datasets.

For example, the factor_hed_tags operation of the HED file remodeling tools creates factor vectors for selecting events satisfying general HED queries.

The HED-based epoching tools in EEGLAB can use HED-based search to epoch data based on HED tags.

Work is underway to integrate HED-based search into other tools, including FieldTrip and MNE-Python, as well as into the analysis platforms NEMAR and EEGNET.