Industry: Finance

Untangling Script Spaghetti using an Embedded DSL

... and maintenance becomes joyful


Read Story
Image AI generated with Google Gemini

Please note: The English version of this success story was translated using AI to make it accessible to our international audience.

tl;dr

  • Industry: Finance / Consumer Loans
  • Modernizing Legacy Data Quality Scripts
Before After
Unmaintainable Script Mess Clean Abstraction
Maintenance Becomes a Pleasure

Situation: Low-Level Scripts Check Data Quality Rules

The client company operates a machine learning model. The model categorizes and tags bank transaction data.

To train this model, manually labeled training data is required. The assigned labels must adhere to certain business data quality rules. Previously, a script was used to check the labeled data for rule compliance.

The rules were hard-coded as low-level Pandas operations—without any abstraction.

Challenge: The rules had become unmaintainable

Maintaining the data quality rules in the existing solution was causing significant problems:

  • The rule language did not support Unicode, necessitating workarounds for attributes containing special characters.
  • While the rule language supported Boolean joins between sub-rules, these joins were extremely error-prone: Operator precedence and parentheses were not clearly visible to users, and reserved keywords (AND, OR) led to collisions. There was no effective tool support.
  • The rule language did not support automated testing. Rules could not be automatically validated with examples.

Solution: An embedded DSL provides clarity

A metamorphant partner developed a simple embedded DSL in Python in collaboration with the client. This provides the rule maintainer with meaningful building blocks. Rules are now described at a higher level of abstraction.

Each building block is testable in itself. Each rule is testable in itself.

The analysis revealed a key finding: recurring patterns emerged in the Boolean combination of multiple rules. The same combinations were frequently repeated in a copy-paste fashion. By capturing the underlying logic and extracting it into separate building blocks, these combinations were transformed into simple function calls: the complex chains were eliminated and are now rarely encountered in the daily work of maintainers.

Results: Maintenance is a Joy

The success of the project was quickly evident: after a prolonged period of stability, the model required further modification. The change was implemented in less than an hour, from start to finish—quality-assured and deployed, without costly context switches or waiting times, and without the risk of regressions in the existing rules.

Technologies

  • Python
  • Pandas
  • Azure DevOps
  • jupyter
  • Conda

Why was phant the right partner?

  • DSLs in the toolbox—but used judiciously; Pragmatic instead of dogmatic
  • Fast feedback loops – short feedback loops and a close connection to customer benefits are part of Metamorphant’s DNA
  • Focus on testability – Metamorphants are capable of many things, but first and foremost, they are good engineers and follow good engineering practices
  • Legacy know-how – Metamorphants understand the long-term costs of software and focus on long-term maintainability

Contact us