How to Perform Fuzzy Matching of Email Addresses and Telephone Numbers Using Elasticsearch?

Published on 2024-11-07

Fuzzy Matching Email or Telephone Using Elasticsearch

Elasticsearch can match email addresses and telephone numbers partially (by domain, by leading characters, or by prefix) using pattern-based queries, and custom analyzers can make those lookups faster.

Email Matching

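To match email addresses ending with a specific domain (e.g., @gmail.com), use a regexp query. A term query compares the value literally and does not interpret ".*" as a pattern. The regular expression must match the whole indexed term, so the field should hold the address as a single term (for example, a keyword field), and the dot in the domain is escaped:

{
    "query": {
        "regexp": {
            "email": ".*@gmail\\.com"
        }
    }
}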

Or, to match emails that start with a specific string (here "sales@"), use a wildcard query, since a match query does not expand "*":

{
    "query": {
        "wildcard": {
            "email": "sales@*"
        }
    }
}

Telephone Matching

For telephone numbers, a prefix query matches every value that begins with the given digits. The prefix is literal text, so no trailing "*" is needed:

{
    "query": {
        "prefix": {
            "tel": "136"
        }
    }
}

This will match all phone numbers starting with "136".
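
The matching styles used in this section can be tried out locally before sending anything to a cluster. The following is a minimal Python sketch, not Elasticsearch code: the sample values and helper names are hypothetical, Python's re and fnmatch modules only approximate Elasticsearch's pattern syntax, and each value is assumed to be indexed as one exact term.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical sample values, standing in for exact (keyword-style) terms.
emails = ["sales@example.com", "bob@gmail.com", "alice@gmail.com.cn"]
phones = ["13612345678", "13712345678", "01361234567"]


def regexp_match(pattern, term):
    # An Elasticsearch regexp pattern must match the whole term,
    # which re.fullmatch imitates.
    return re.fullmatch(pattern, term) is not None


def wildcard_match(pattern, term):
    # '*' stands for any run of characters, as in a wildcard query.
    return fnmatchcase(term, pattern)


def prefix_match(prefix, term):
    # A prefix is literal text; no wildcard character is involved.
    return term.startswith(prefix)


gmail = [e for e in emails if regexp_match(r".*@gmail\.com", e)]
sales = [e for e in emails if wildcard_match("sales@*", e)]
starts_136 = [p for p in phones if prefix_match("136", p)]

print(gmail)
print(sales)
print(starts_136)
```

Because the regular expression is anchored to the whole term, "alice@gmail.com.cn" is not selected by the domain pattern, and "01361234567" is not selected by the prefix even though it contains "136".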

Performance Optimization

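Pattern-based queries are evaluated against the indexed terms at search time, which can be slow on large indices. To improve performance, consider custom analyzers built on n-gram or edge n-gram tokenization. They index the substrings (n-gram) or leading substrings (edge n-gram) of each value ahead of time, so a partial match becomes an ordinary term lookup, at the cost of a larger index.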

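Email Analyzer Configuration (on Elasticsearch 7 and later, index.max_ngram_diff defaults to 1 and must be raised to cover the gap between min_gram and max_gram):

{
  "settings": {
    "index": {
      "max_ngram_diff": 17
    },
    "analysis": {
      "analyzer": {
        "email_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "name_ngram_filter",
            "trim"
          ]
        }
      },
      "filter": {
        "name_ngram_filter": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 20
        }
      }
    }
  }
}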
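
To get a feel for what an n-gram filter with min_gram 3 and max_gram 20 adds to the index, the expansion can be imitated in a few lines of Python. This is only a sketch of the idea, not Elasticsearch's implementation: the ngrams helper and the sample token "gmail.com" are hypothetical, and the token is assumed to have already been produced by the tokenizer.

```python
def ngrams(token, min_gram=3, max_gram=20):
    """Return every substring of token with a length from min_gram to max_gram."""
    out = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size > len(token):
                break  # longer grams would run past the end of the token
            out.append(token[start:start + size])
    return out


# Lowercase first, mirroring the order of the filter chain above.
token = "Gmail.com".lower()
grams = ngrams(token)

print(len(grams))
print(grams[:5])
```

Every gram becomes a searchable term, which is why a short fragment such as "mail" can be found with a plain term lookup, and also why wide min_gram/max_gram ranges grow the index quickly.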

Telephone Analyzer Configuration (the digit_only char filter strips every non-digit character before the edge n-gram tokenizer runs):

{
  "settings": {
    "analysis": {
      "analyzer": {
        "phone_analyzer": {
          "type": "custom",
          "char_filter": [
            "digit_only"
          ],
          "tokenizer": "digit_edge_ngram_tokenizer",
          "filter": [
            "trim"
          ]
        }
      },
      "char_filter": {
        "digit_only": {
          "type": "pattern_replace",
          "pattern": "\\D+",
          "replacement": ""
        }
      },
      "tokenizer": {
        "digit_edge_ngram_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 3,
          "max_gram": 15,
          "token_chars": [
            "digit"
          ]
        }
      }
    }
  }
}
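
The effect of the telephone analyzer can be approximated in Python to see which terms end up in the index. This is a rough sketch under stated assumptions, not the analyzer itself: phone_tokens is a hypothetical helper, the sample number is made up, and the regular expression stands in for a char filter that removes all non-digit characters.

```python
import re


def phone_tokens(raw, min_gram=3, max_gram=15):
    """Imitate the analyzer: strip non-digits, then emit leading substrings."""
    digits = re.sub(r"\D+", "", raw)       # role of the digit_only char filter
    longest = min(len(digits), max_gram)   # edge n-grams stop at max_gram
    return [digits[:n] for n in range(min_gram, longest + 1)]


tokens = phone_tokens("136-1234 5678")

print(tokens)
```

Since "136" is itself one of the indexed terms, a search for numbers starting with 136 no longer needs a prefix scan. This assumes the query text is not n-grammed again at search time, which is usually handled by giving the field a simpler search analyzer.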
