Data Science Solution for the Fintech Sector

Merchant recognition solution based on ML.



Data collection service company

Target audience

European banks


Insufficient data for attracting relevant audiences and maintaining the desired level of customer retention.


A data science application based on machine learning that lets banks extract valuable data about merchants for further analysis, which supports stronger strategy-building.


Emerline was involved in the development of a data science solution aimed at providing European banks with detailed, categorized information about the use of their products: debit and credit cards. The goal was to establish a mechanism that would automatically detect information about merchants that is valuable to banks, based on their customers’ payments, and then split this data into categories. In this way, a bank would be able to determine who its key merchants are and gain insights into its customers’ behavior.

Our team was responsible for creating the ML algorithms that would ensure the extraction of this data:

  • Merchant’s URL scoring based on different criteria
  • Extraction of valid information
  • Product recognition
  • Merchant categorization

    The categorization process had to follow the list of categories provided by the client.

    One more challenge was to optimize URL scoring so that it matched, as closely as possible, how humans select websites when browsing for information.
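To illustrate what URL scoring by several criteria might look like, here is a minimal sketch in plain Python. The features, weights, aggregator list, and function names are all hypothetical and chosen only for illustration; they are not the criteria used in the project, where the scoring was learned by the models listed under Technologies Used.

```python
import math
from typing import Dict, List
from urllib.parse import urlparse

# Hypothetical weights; a real system would learn these from labelled data.
WEIGHTS: Dict[str, float] = {
    "name_in_domain": 2.5,   # merchant name appears in the host name
    "is_https": 0.5,         # site is served over HTTPS
    "path_depth": -0.4,      # deep pages are less likely to be a home page
    "is_aggregator": -2.0,   # site describes many merchants, not this one
}
BIAS = -1.0

# Illustrative list of sites that host pages for many different merchants.
AGGREGATOR_DOMAINS = {"facebook.com", "yelp.com", "tripadvisor.com"}


def url_features(url: str, merchant_name: str) -> Dict[str, float]:
    """Turn a candidate URL into a small numeric feature vector."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    compact_name = "".join(ch for ch in merchant_name.lower() if ch.isalnum())
    path_parts = [part for part in parsed.path.split("/") if part]
    name_hit = bool(compact_name) and compact_name in host.replace("-", "")
    return {
        "name_in_domain": float(name_hit),
        "is_https": float(parsed.scheme == "https"),
        "path_depth": float(len(path_parts)),
        "is_aggregator": float(host in AGGREGATOR_DOMAINS),
    }


def score_url(url: str, merchant_name: str) -> float:
    """Logistic score in (0, 1); higher means a more plausible merchant site."""
    features = url_features(url, merchant_name)
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def best_url(candidates: List[str], merchant_name: str) -> str:
    """Pick the highest-scoring candidate, as a person skimming results might."""
    return max(candidates, key=lambda url: score_url(url, merchant_name))
```

Calling `best_url` with a list of search results and the merchant name returns the candidate that the weighted criteria rank highest. Replacing the hand-set weights with a trained model keeps the same interface.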

How Does the Solution Work?

The delivered solution works as follows:

  1. A customer makes a purchase with a debit or credit card.
  2. Brief information about the purchase (where it took place, the merchant’s name, the merchant category code (MCC), etc.) goes to the bank.
  3. From the bank’s server, the information is transferred to the server of the data science solution we worked on.
  4. The servers search the web for information about the merchant and use machine learning to determine what products or services the merchant provides.
  5. Based on the extracted data, the solution determines the categories related to the merchant and sends this information to the bank.
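The flow of steps 2 to 5 can be sketched as a small pipeline. This is only an illustration under assumptions: the `Purchase` and `MerchantProfile` types, the function names, and the two stand-in functions are hypothetical and do not reflect the actual service interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Purchase:
    """Brief purchase information as it might arrive from the bank (step 2)."""
    merchant_name: str
    location: str
    mcc: str  # merchant category code sent with the transaction


@dataclass
class MerchantProfile:
    """Result returned to the bank (step 5)."""
    merchant_name: str
    products: List[str] = field(default_factory=list)
    categories: List[str] = field(default_factory=list)


def process_purchase(
    purchase: Purchase,
    find_products: Callable[[Purchase], List[str]],
    categorize: Callable[[List[str]], List[str]],
) -> MerchantProfile:
    """Steps 3 to 5: receive the purchase, look up the merchant, categorize it."""
    products = find_products(purchase)   # step 4: web lookup plus ML recognition
    categories = categorize(products)    # step 5: map products to categories
    return MerchantProfile(purchase.merchant_name, products, categories)


# Stand-ins for the real lookup and categorization components.
def fake_find_products(purchase: Purchase) -> List[str]:
    if "cafe" in purchase.merchant_name.lower():
        return ["espresso", "croissant"]
    return []


def fake_categorize(products: List[str]) -> List[str]:
    return ["Food & Drink"] if products else ["Uncategorized"]
```

Passing the lookup and categorization steps in as functions keeps the pipeline independent of how each step is implemented, so a stub can be swapped for a trained model without changing the flow.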

Technologies Used

In the course of the project, our data science specialists used the following technologies for each of the system’s tasks:

  • Merchant classification: Gensim, Doc2vec, tf-idf, logistic regression
  • URL scoring: XGBoost, LightGBM, logistic regression, Optuna
  • Product recognition: plain Python, NumPy
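As a rough illustration of tf-idf-based merchant classification, the sketch below weights the words of a merchant’s website text by tf-idf and picks the closest category by cosine similarity. Note the substitution: cosine similarity against short category descriptions is a simpler stand-in for the Gensim and logistic regression models named above, and the categories, their descriptions, and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical category descriptions; the real category list came from the client.
CATEGORY_TEXTS: Dict[str, str] = {
    "Groceries": "fresh food vegetables fruit bread milk supermarket",
    "Electronics": "laptop phone camera headphones charger gadgets",
    "Travel": "flights hotel booking tickets luggage tours",
}


def tokenize(text: str) -> List[str]:
    """Naive tokenizer: lowercase, split on whitespace, keep alphabetic tokens."""
    return [token for token in text.lower().split() if token.isalpha()]


def idf_table(docs: List[List[str]]) -> Dict[str, float]:
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {
        term: math.log((1 + n_docs) / (1 + count)) + 1.0
        for term, count in doc_freq.items()
    }


def tfidf(tokens: List[str], idf: Dict[str, float]) -> Dict[str, float]:
    """Sparse tf-idf vector; terms unseen in the corpus are ignored."""
    counts = Counter(tokens)
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(text: str) -> str:
    """Return the category whose description is most similar to the text."""
    names = list(CATEGORY_TEXTS)
    docs = [tokenize(CATEGORY_TEXTS[name]) for name in names]
    idf = idf_table(docs)
    query = tfidf(tokenize(text), idf)
    scores = {name: cosine(query, tfidf(doc, idf)) for name, doc in zip(names, docs)}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Uncategorized"
```

For example, `categorize("we sell fresh bread and milk every day")` matches only words from the grocery description, so it returns `"Groceries"`, while text with no known words falls back to `"Uncategorized"`.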


Promptly addressing the challenges that arose during development, including those related to invalid URLs and compiling a list of products for the system to recognize, our team provided the client with a well-thought-out solution that gathers information important to banks about their customers’ behavior and merchants. With such a system at hand, the client can strengthen its position in the market of data collection service providers by offering it to banks for the extraction of useful insights.
