Beyond the Boolean Search – Better Searching Techniques for Enterprise Search

Posted by Christian Schömmer on Oct 1, 2015 11:07:02 PM

We’ve all been frustrated searching the Internet or corporate intranet for the right item.  Thousands of files can pop up, and it can take 15 minutes or more to find the right entry.  Search is supposed to surface the right information quickly, but unless you understand the tricks of the trade, it can be difficult to narrow down the results.

While most people understand Phrase Search (quotes around keywords, such as “apple pie”) and Boolean Search Strings (linking terms with operators such as AND, OR, and NOT, as in “apple pie” OR custard), there are several additional search techniques to maximize results.  Here are three simple, effective search practices that you may not know about:

 

Fuzzy Search

We’ve all heard about “fuzzy math” and “fuzzy logic,” but how many understand the benefits of fuzzy search?  To perform a fuzzy search, add the tilde (~) symbol to the end of a single-word term. This helps if you’re unsure of the exact word or its spelling.  For example, a fuzzy search for roam~ will find terms like “foam” and “roams.”  You can also add a value between 0 and 1 after the tilde to control how similar the matches must be; values closer to 1 require a closer match.  For example, roam~0.8 will turn up only more closely matching words, like roams.  Keep in mind that a default value of 0.5 is used if no value is given.
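To make the similarity threshold concrete, here is a minimal Python sketch of fuzzy matching based on edit distance. The similarity formula (one minus the edit distance divided by the longer word's length), the function names, and the sample vocabulary are all assumptions chosen for illustration; a real search engine may score similarity differently, so the exact cutoffs will vary.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Illustrative similarity in [0, 1]; 1.0 means the words are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return (longest - edit_distance(a, b)) / longest


def fuzzy_matches(term, vocabulary, min_similarity=0.5):
    """Return the words whose similarity to term meets the threshold."""
    return [word for word in vocabulary if similarity(term, word) >= min_similarity]


# Hypothetical vocabulary of indexed words.
vocabulary = ["roam", "roams", "foam", "road", "rhombus"]

print(fuzzy_matches("roam", vocabulary))        # default threshold of 0.5
print(fuzzy_matches("roam", vocabulary, 0.8))   # stricter threshold
```

Under these assumptions, the default threshold accepts “roams,” “foam,” and “road,” while raising it to 0.8 keeps only “roam” and “roams.”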

 

Wildcard Search

Uncertain about the proper spelling or letters in a search word? Wildcards are exceptionally useful.  There are two popular wildcard symbols, * and ?.  The asterisk (*) wildcard stands in for any number of characters, including none.  For example, Info* will match all words that begin with “info,” such as information, informant, etc. The question mark (?) wildcard stands in for exactly one character.  For example, J?nsen will match Jensen, Jansen, Jonsen, etc.
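The two wildcards can be sketched in Python by translating a pattern into a regular expression. This is only an illustration of the matching rules described above, not how any particular search engine implements them; the function names, the sample word list, and the choice to ignore case are assumptions.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate * (any run of characters) and ? (exactly one character)."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")
        elif char == "?":
            parts.append(".")
        else:
            parts.append(re.escape(char))  # treat everything else literally
    # Assumption: matching ignores case, as many search tools do.
    return re.compile("".join(parts), re.IGNORECASE)


def wildcard_matches(pattern, words):
    """Return the words that match the whole wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [word for word in words if regex.fullmatch(word)]


# Hypothetical list of indexed words.
words = ["information", "informant", "info", "index",
         "Jensen", "Jansen", "Jonsen", "Johansen"]

print(wildcard_matches("Info*", words))
print(wildcard_matches("J?nsen", words))
```

Note that J?nsen does not match “Johansen” in this sketch, because the question mark replaces exactly one character.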

 

Boosting

For any search, matched documents are assigned a relevance level based on the terms found.  This determines the order of documents presented in the search results list.  You can change the default relevance, and thus the order of the search results, by boosting words or phrases. To boost a term, add the caret symbol (^) and a boost factor (a number) directly after the term, with no space in between.  The higher the boost factor, the more weight the term carries. For example, if you are searching for apple pie and want the term “apple” to be more relevant, you would type: apple^4 pie.  This will make documents containing the term apple rank as more relevant. You can also boost phrases, as in the example “apple pie”^5 “favorite desserts”.  The default boost factor is 1. The boost factor must be a positive number, but it can be less than 1 (for example, 0.2) to reduce the weight of a term.
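The effect of a boost factor on result order can be sketched with a deliberately simple scoring function. The score used here (the boost factor multiplied by the number of times a term occurs) is an assumption for illustration only; real search engines use more elaborate relevance formulas. The function names and sample documents are hypothetical.

```python
def score(document: str, boosts: dict) -> float:
    """Toy relevance score: sum of boost * occurrences for each query term."""
    words = document.lower().split()
    return sum(boost * words.count(term) for term, boost in boosts.items())


def rank(documents, boosts):
    """Order documents from most to least relevant under the toy score."""
    return sorted(documents, key=lambda doc: score(doc, boosts), reverse=True)


# Hypothetical documents in the index.
documents = [
    "pie crust and pie filling and pie weights",
    "apple orchard tour",
    "apple pie recipe",
]

# Query: apple pie (both terms use the default boost factor of 1).
print(rank(documents, {"apple": 1, "pie": 1}))

# Query: apple^4 pie (the term "apple" counts four times as much).
print(rank(documents, {"apple": 4, "pie": 1}))
```

Without a boost, the document that mentions pie three times ranks first. With apple^4, both documents containing “apple” move ahead of it.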

 

Whether you’re searching for a document on your work server or on the Internet, these tips can take some of the frustration out of searching and help you quickly locate documents.  To learn more about enterprise search best practices, download this whitepaper.


