Use this tool to find Catalan words that match certain lexical parameters (frequency, part-of-speech, etc.).

It aims to make the search for experimental materials easier for researchers.


Number of letters: Matches:

To look for matches, two wildcards can be used:

'_' - Stands for any single letter (e.g., searching 'p_ay' would return 'play', 'pray', etc.).

'%' - Stands for any sequence of letters (e.g., searching 'cr%ed' would return 'created', 'crowded', etc.).
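
The two wildcards resemble those of the SQL LIKE operator. As a minimal sketch, assuming that semantics (the tool's actual implementation is not described here), a pattern can be translated into a regular expression. The names `wildcard_to_regex` and `match_words` and the sample word list are hypothetical, and the sketch assumes that '%' may also stand for an empty sequence.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate '_' (one character) and '%' (any sequence) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")            # exactly one character
        elif ch == "%":
            parts.append(".*")           # assumption: the sequence may be empty
        else:
            parts.append(re.escape(ch))  # any other character is literal
    return re.compile("".join(parts))

def match_words(pattern: str, words: list[str]) -> list[str]:
    """Return the words that match the whole pattern."""
    regex = wildcard_to_regex(pattern)
    return [w for w in words if regex.fullmatch(w)]

sample = ["play", "pray", "pay", "created", "crowded"]
print(match_words("p_ay", sample))   # ['play', 'pray']
print(match_words("cr%ed", sample))  # ['created', 'crowded']
```

Matching against the whole word (rather than a substring) is what makes 'p_ay' reject 'pay': the '_' must consume exactly one character.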

Relative frequency (per million):
Exactly:

Search for words with a specific length.

If you fill in all the number-of-letters fields, the exact length prevails over the minimum and maximum lengths.

The valid range of values is between 1 and 26 letters. Values outside this range are replaced by the nearest valid value.
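
The rules for the number-of-letters fields can be sketched as follows. This is an illustration under assumptions, not the tool's code: the function names are hypothetical, and an empty field is represented by `None`. The defaults of 1 and 26 are the ones stated for the minimum and maximum length fields.

```python
from typing import Optional, Tuple

MIN_LEN, MAX_LEN = 1, 26  # valid range stated in the help text

def clamp(n: int) -> int:
    """Replace an out-of-range value with the nearest valid one."""
    return max(MIN_LEN, min(MAX_LEN, n))

def resolve_length_range(exact: Optional[int] = None,
                         at_least: Optional[int] = None,
                         at_most: Optional[int] = None) -> Tuple[int, int]:
    """Return the inclusive (low, high) word-length range to search."""
    if exact is not None:
        n = clamp(exact)
        return (n, n)  # the exact length prevails over minimum and maximum
    low = clamp(at_least) if at_least is not None else MIN_LEN
    high = clamp(at_most) if at_most is not None else MAX_LEN
    return (low, high)

print(resolve_length_range(exact=5, at_least=3, at_most=8))  # (5, 5)
print(resolve_length_range(at_least=0, at_most=40))          # (1, 26)
```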

Beginning with:

Search for words that start with a given sequence of letters.

It is important not to leave blank spaces in this box. The program distinguishes between accented and unaccented letters and is case sensitive.

The sequence to search for cannot be longer than 6 characters.

Exactly:

Search for words with a specific frequency of occurrence per million.

If all frequency search fields are filled in, the search for the exact value prevails over the others.

Since the values of relative frequency include several decimals, an exact search would yield few results and therefore would not be practical.

For this reason, the search function works with integer values. For example, a search with a relative frequency of 20 will display words with a relative frequency between 20 and 20.99.
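
In other words, an exact search compares only the integer part of each word's relative frequency. A minimal sketch, assuming the stored frequencies are non-negative decimals; the function name and the sample frequencies are invented for illustration and are not taken from the database:

```python
import math

def matches_exact_frequency(freq_per_million: float, target: int) -> bool:
    """True when the integer part of the frequency equals the target."""
    return math.floor(freq_per_million) == target

# Invented values, for illustration only
sample = {"casa": 20.0, "gat": 20.99, "arbre": 21.0, "mar": 19.99}
hits = [w for w, f in sample.items() if matches_exact_frequency(f, 20)]
print(hits)  # ['casa', 'gat']
```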

Equal to or longer than:

Search for words with a length equal to or longer than the specified value.

The valid range of values is between 1 and 26 letters. Values outside this range are replaced by the nearest valid value.

The default value is 1.

Containing:

Search for words containing a particular sequence of letters.

It is important not to leave blank spaces in this box. The program distinguishes between accented and unaccented letters and is case sensitive.

The sequence to search for cannot be longer than 6 characters.

Equal to or greater than:

Search for words with a frequency of occurrence per million equal to or higher than the specified value.

You can specify decimal values using "." as the delimiter. Any other delimiter (e.g. , or ') will invalidate the search criterion.
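
A minimal sketch of this input rule, assuming that an entry with any delimiter other than "." is simply discarded; how the tool actually reacts is not specified here. The function name `parse_frequency` is hypothetical.

```python
import re
from typing import Optional

# Digits, optionally followed by "." and more digits
_DECIMAL = re.compile(r"\d+(\.\d+)?")

def parse_frequency(text: str) -> Optional[float]:
    """Return the value as a float, or None if the entry is not valid."""
    text = text.strip()
    if _DECIMAL.fullmatch(text):
        return float(text)
    return None  # assumption: an invalid entry drops the criterion

print(parse_frequency("20.5"))  # 20.5
print(parse_frequency("20,5"))  # None
print(parse_frequency("20'5"))  # None
```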

Equal to or shorter than:

Search for words with a length equal to or shorter than the specified value.

The valid range of values is between 1 and 26 letters. Values outside this range are replaced by the nearest valid value.

The default value is 26.

Ending in:

Search for words that end in a certain sequence of letters.

It is important not to leave blank spaces in this box. The program distinguishes between accented and unaccented letters and is case sensitive.

The sequence to search for cannot be longer than 6 characters.
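
The three sequence fields (beginning, containing, ending) can be sketched with plain string comparisons, which are already case sensitive and accent sensitive. This is an illustration under assumptions: the function name is hypothetical, the fields are assumed to combine with a logical AND, and a sequence longer than 6 characters is assumed to be rejected (the tool may handle it differently).

```python
MAX_SEQ_LEN = 6  # limit stated in the help text

def filter_by_sequence(words, begins="", contains="", ends=""):
    """Keep the words that satisfy every non-empty sequence field."""
    for name, seq in (("begins", begins), ("contains", contains), ("ends", ends)):
        if len(seq) > MAX_SEQ_LEN:
            # assumption: over-long sequences are rejected
            raise ValueError(f"'{name}' is longer than {MAX_SEQ_LEN} characters")
    # An empty field matches every word, so it imposes no restriction
    return [w for w in words
            if w.startswith(begins) and contains in w and w.endswith(ends)]

sample = ["camió", "Camió", "camio"]
print(filter_by_sequence(sample, begins="cam"))  # ['camió', 'camio']
print(filter_by_sequence(sample, ends="ió"))     # ['camió', 'Camió']
```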

Equal to or less than:

Search for words with a frequency of occurrence per million equal to or lower than the specified value.

You can specify decimal values using "." as the delimiter. Any other delimiter (e.g. , or ') will invalidate the search criterion.

 

Part-of-speech:

Specify the parts of speech that the words should belong to.

If you do not select any box, the search covers all parts of speech.

   
Selection Criteria:

Choose whether the words must belong to at least one of the selected parts of speech, or to all of them at the same time.

  Categories to avoid:

Specify the parts of speech that the words must NOT belong to.

Checking a box as a category to avoid prevails over checking the box that displays that category.

       
Part-of-speech   Categories to avoid
Substantive      Substantive
Adjective        Adjective
Adverb           Adverb
Verb             Verb
Pronoun          Pronoun
Conjunction      Conjunction
Interjection     Interjection
Other            Other
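
The part-of-speech rules can be sketched with set operations. This is a minimal illustration, not the tool's code: the function name and the `require_all` flag are hypothetical, and the sketch assumes that a word is excluded as soon as it belongs to any category marked to avoid.

```python
def pos_matches(word_pos: set, wanted: set, avoid: set,
                require_all: bool = False) -> bool:
    """Decide whether a word with the given parts of speech is returned."""
    # A category marked to avoid prevails over the same category selected
    wanted = wanted - avoid
    if word_pos & avoid:
        return False  # assumption: any avoided category excludes the word
    if not wanted:
        return True   # no box selected: all parts of speech are searched
    if require_all:
        return wanted <= word_pos   # the word has every selected category
    return bool(wanted & word_pos)  # the word has at least one of them

both = {"Substantive", "Verb"}
print(pos_matches({"Substantive"}, both, set()))                    # True
print(pos_matches({"Substantive"}, both, set(), require_all=True))  # False
print(pos_matches({"Substantive", "Adjective"},
                  {"Substantive"}, {"Adjective"}))                  # False
```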