Help With Regular Expressions
The dictionary supports Regular Expression (RegEx) searches. These
provide a very powerful way of searching the dictionary but may not be
necessary for all users and can be difficult to understand at first.
A Regular Expression search is entered in the same way as any other
search. When using regular expressions we usually recommend that you
select the Part Word and Accent Sensitive options. This is because the
Whole Word and Accent Insensitive options insert extra regular
expressions into your search. These can interact unexpectedly with your
own Regular Expression and prevent you from finding what you are
looking for.
Here are some examples of what Regular Expression
searches can do:
1. Lenition
Using the asterisk, which matches the preceding character zero or more
times, you can locate both lenited and unlenited forms of a word in the
same search. For example:
- fh*uar will find entries containing fuar or fhuar.
- mh*ór will find entries containing mór or mhór.
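The effect of the asterisk can be tried outside the dictionary. The following is a minimal sketch in Python, assuming that Python's re module treats these simple patterns the same way as the dictionary's search engine. The word list and the helper name find_entries are made up for illustration.

```python
import re

def find_entries(pattern, entries):
    """Return the entries containing a match anywhere (similar to a Part Word search)."""
    rx = re.compile(pattern)
    return [e for e in entries if rx.search(e)]

# Hypothetical sample entries, not taken from the dictionary.
entries = ["fuar", "fhuar", "glè fhuar", "mór", "mhór", "fear"]

lenition_fuar = find_entries("fh*uar", entries)  # f, any number of h, then uar
lenition_mor = find_entries("mh*ór", entries)    # m, any number of h, then ór

print(lenition_fuar)
print(lenition_mor)
```

Strictly speaking, h* allows any number of h characters in a row. Writing h? instead allows at most one, which is all that lenition needs.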
2. Alternate Letters
If you want to locate words that you know have alternative spellings,
you can put the alternatives in parentheses separated by a vertical
bar, ( | ), to locate both in the same search. For example:
- (ao|adh)bhar will locate both adhbhar and aobhar in the same search.
- (t|d)àinig will locate both tàinig and dàinig in the same search.
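As a rough illustration, the alternation can be reproduced in Python. This is a sketch only: it assumes Python's re module handles ( | ) the same way as the dictionary does, and the entries and the helper name matching are invented for the example.

```python
import re

# Hypothetical sample entries for illustration only.
entries = ["adhbhar", "aobhar", "abhar", "tàinig", "dàinig", "ràinig"]

def matching(pattern):
    """Return the sample entries that contain a match for pattern."""
    return [e for e in entries if re.search(pattern, e)]

bhar_hits = matching("(ao|adh)bhar")  # ao or adh, followed by bhar
ainig_hits = matching("(t|d)àinig")   # t or d, followed by àinig

print(bhar_hits)
print(ainig_hits)
```

Entries that fit neither alternative, such as abhar and ràinig in this made-up list, are left out.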
3. Words Ending or Beginning in...
If you use [[:>:]] or [[:<:]], you can search for words ending or
beginning in a particular string. [[:>:]] marks the end of a word and
[[:<:]] marks the beginning of a word. For example:
- éis[[:>:]] will find all words ending in éis, including those with
punctuation (for example a hyphen) immediately after éis, such as
féis, séis, créis, éis-bhreith...
- [[:<:]]éis will find all words beginning in éis, including those
where éis comes immediately after punctuation (for example a hyphen),
such as éisteachd, éist, éisleanach, -éisg, éis-bhreith...
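The [[:<:]] and [[:>:]] markers are not understood by every regular expression dialect. Python's re module, for example, does not support them. The sketch below shows one way to try the two searches in Python by translating the markers into lookarounds that behave approximately the same way. The helper to_python_regex and the sample entries are hypothetical and are not part of the dictionary.

```python
import re

def to_python_regex(mysql_pattern):
    """Translate the word-boundary markers into Python lookarounds (hypothetical helper)."""
    return (mysql_pattern
            .replace("[[:<:]]", r"(?<!\w)(?=\w)")   # start of a word
            .replace("[[:>:]]", r"(?<=\w)(?!\w)"))  # end of a word

# Hypothetical sample entries for illustration only.
entries = ["féis", "séis", "éisteachd", "éis-bhreith", "éisg air", "réiseach"]

def matching(mysql_pattern):
    """Return the sample entries that contain a match for the translated pattern."""
    rx = re.compile(to_python_regex(mysql_pattern))
    return [e for e in entries if rx.search(e)]

ending = matching("éis[[:>:]]")     # éis at the end of a word
beginning = matching("[[:<:]]éis")  # éis at the start of a word

print(ending)
print(beginning)
```

Note that éis-bhreith appears in both results, because the hyphen counts as the end of the word éis.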
4. Vowel Permutations
There are various ways in
which you can search for words with varying vowels between consonants.
For example:
- c.s will find words where any single character (not only a vowel)
is between c and s, such as cas, càs, cus, cìs...
- c[aouei]r will find words where an unaccented a, o, u, e or i is
between c and r, such as car, cor, cur...
- c[aoueiàòóùèéì]r will find words where either an accented or an
unaccented a, o, u, e or i is between c and r, such as car, càr, cìr,
cor, cur...
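The difference between the dot and a bracketed list of letters can be seen in a short Python sketch. It assumes Python's re module behaves like the dictionary for these patterns. The entries are invented, and cns is not a real word: it is included only to show that the dot is not limited to vowels.

```python
import re
import unicodedata

def nfc(text):
    """Normalise so that each accented letter is a single character."""
    return unicodedata.normalize("NFC", text)

# Hypothetical sample entries for illustration only.
entries = [nfc(e) for e in
           ["cas", "càs", "cus", "cìs", "cns", "car", "càr", "cìr", "cor", "cur"]]

def whole_word(pattern):
    """Return the entries that match the pattern from start to finish."""
    rx = re.compile(nfc(pattern))
    return [e for e in entries if rx.fullmatch(e)]

dot = whole_word("c.s")                    # any one character between c and s
plain = whole_word("c[aouei]r")            # unaccented vowels only
accented = whole_word("c[aoueiàòóùèéì]r")  # accented or unaccented vowels

print(dot)
print(plain)
print(accented)
```

The bracketed forms are the safer choice when only vowels are wanted, since the dot also accepts consonants and other characters.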
5. Phrase Search
To search for a phrase in which two words appear in the same entry but
not necessarily next to each other, such as a cat may look at the
king, use the following expression:
[[:<:]]cat[[:>:]].*[[:<:]]king[[:>:]]
This will find all entries that contain the word cat, then anything
else, then the word king.
To search for a different pairing, simply replace the words cat and
king.
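The phrase expression can also be tried in Python as a rough sketch. Python's re module does not support the [[:<:]] and [[:>:]] markers, so the hypothetical helper to_python_regex below swaps them for lookarounds that behave approximately the same way. The sample entries are made up.

```python
import re

def to_python_regex(mysql_pattern):
    """Translate the word-boundary markers into Python lookarounds (hypothetical helper)."""
    return (mysql_pattern
            .replace("[[:<:]]", r"(?<!\w)(?=\w)")   # start of a word
            .replace("[[:>:]]", r"(?<=\w)(?!\w)"))  # end of a word

phrase = to_python_regex("[[:<:]]cat[[:>:]].*[[:<:]]king[[:>:]]")

# Hypothetical sample entries for illustration only.
entries = [
    "a cat may look at the king",
    "the king has a cat",
    "catalogue of kings",
    "cat and king",
]

hits = [e for e in entries if re.search(phrase, e)]
print(hits)
```

The order of the two words matters: an entry in which king comes before cat is not found, and neither is one where cat or king is only part of a longer word.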
The % wildcard character, which could be used in older versions of the
dictionary, is still supported.
6. More About RegEx
There are several different dialects of Regular Expression which are
implemented in slightly different ways. The dictionary uses MySQL
Regular Expressions and more information about these can be found here.
There are also several good general resources online explaining
Regular Expressions in detail, for example here.