What is Fuzzy Matching?

Why CPAs Should Know About This Powerful Tool

TECHNOLOGY ISSUES

By Shivam Arora, CPA

In accounting, we regularly encounter situations where work is performed manually and there is substantial scope for automation. For example, a CPA was downloading sales tax permits for their client’s vendors to perform scoping for potential tax refunds. After downloading a permit, they would move it manually to the respective vendor’s folder.

This process, they complained, took hours of their time, especially since multiple state permits had to be downloaded per vendor. However, the even more frustrating fact was that all file names “almost” contained their vendor’s name, with omissions of letters and additions of special characters that did not seem to follow any one pattern.

This is a common problem. There are many business cases where practitioners spend time on text-matching tasks that are intuitively obvious but do not follow a pattern and therefore must be performed manually. Fortunately, solutions exist, such as fuzzy matching.

Fuzzy Matching

Fuzzy matching is an umbrella term for statistical techniques that compare and match approximately equal strings. These techniques apply statistical rules to arrive at a relative degree of truth about the similarity between two strings, in contrast to a Boolean approach, which uses a separate, hard-coded format for each task and returns only a Yes/No answer.

The concept of fuzzy matching is analogous to the substance over form principle in accounting. It’s the same reason why passthrough entities do not pay income tax even though they are technically separate legal entities from their owners, or why the IRS sometimes classifies unusually large salary payments to owners as dividends even though they are technically salary payments.

Fuzzy matching, like these cases, gives preference to substantial equivalence between strings over their technical form.

The Underlying Logic

There are several approaches available to fuzzy match data, but I will briefly cover only the most common one. The Levenshtein Distance (LD) is commonly used to establish similarity between two strings. It is the minimum number of single-character edits required to change one of the two strings into the other. An edit can be a character’s insertion, deletion or replacement. Consider the following strings:

charlie_vendor
$charl_vndr$

Assume that the naming convention of vendors in a system is “{name}_vendor.” It is intuitively obvious that the file downloaded is for the vendor Charlie. However, unless all downloaded files follow the same naming convention as above, a Boolean approach to matching will declare both strings unequal.

When I run LD-based fuzzy matching (LDFM) in Python, I get an LD of 6. This means that the minimum number of single-character changes needed to make the file name exactly match the vendor’s name is 6. Converting it into the LD ratio (using a formula I will not delve into), I obtain 0.77.

What I now have is a quantified degree to which both strings are similar: my computer understands that both strings are about 77% similar. It still knows that they are not equal; it has just established their substantial equivalence.
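To make these numbers concrete, here is a minimal Python sketch of the calculation. The article does not show its code or state its ratio formula, so the function names (edit_distance, similarity_ratio) and the normalization are assumptions. The ratio below charges a replacement as two edits and divides by the combined length of both strings, a convention some Levenshtein libraries use. Under that assumption it reproduces an LD of 6 and a ratio of about 0.77 for the two example strings.

```python
def edit_distance(a: str, b: str, sub_cost: int = 1) -> int:
    """Minimum cost of insertions, deletions and replacements turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                                  # delete ca
                curr[j - 1] + 1,                              # insert cb
                prev[j - 1] + (0 if ca == cb else sub_cost),  # keep or replace
            ))
        prev = curr
    return prev[-1]


def similarity_ratio(a: str, b: str) -> float:
    """Assumed normalization: 1.0 means identical, 0.0 means nothing in common."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0
    # A replacement is charged as one deletion plus one insertion here.
    return (total - edit_distance(a, b, sub_cost=2)) / total


vendor_name = "charlie_vendor"
file_name = "$charl_vndr$"
print(edit_distance(vendor_name, file_name))               # 6
print(round(similarity_ratio(vendor_name, file_name), 2))  # 0.77
```

The six edits are the two added dollar signs plus the four dropped letters (i, e, e and o).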

Applications of Fuzzy Matching in Accounting

There can be several applications of fuzzy matching in accounting. A few of them follow.

File Renaming. As with the above case, fuzzy matching can be used to rename downloaded files and match them to their respective group. Names of files downloaded from the internet often contain either truncated text or unwanted characters.

Support Accounting Processes. Fuzzy matching can support accounting processes such as bank reconciliations, inventory tracking and evidence gathering for various types of audits.

Internal Controls. Fuzzy matching can detect duplicate AP payments with minor variations, compare purchase orders to the delivery invoice or bill of lading, and enforce data entry checks. In case of fraud, it can also aid in identifying matches across different databases or comparing fraudulent acts across different time periods.

Preprocessing for ML. With the arrival of artificial intelligence, organizations are increasingly utilizing machine learning (ML) techniques. A substantial amount of ML in the financial space occurs on data generated by accounting systems. By facilitating preprocessing of data using fuzzy matching techniques, organizations can develop robust and accurate ML models.
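As one illustration of the internal-control use above, the following sketch flags possible duplicate AP payments: pairs with the same amount whose payee names are nearly identical. Everything here is hypothetical, including the sample payments, the field names and the 0.85 threshold. For brevity it uses difflib.SequenceMatcher from Python's standard library, which is a different similarity measure from Levenshtein Distance but serves the same purpose.

```python
from difflib import SequenceMatcher
from itertools import combinations


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between two payee names (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def flag_possible_duplicates(payments, threshold=0.85):
    """Return id pairs with equal amounts and near-identical payee names."""
    flagged = []
    for p, q in combinations(payments, 2):
        same_amount = p["amount_cents"] == q["amount_cents"]
        if same_amount and name_similarity(p["payee"], q["payee"]) >= threshold:
            flagged.append((p["id"], q["id"]))
    return flagged


# Hypothetical sample data; amounts are kept in cents to avoid float rounding.
payments = [
    {"id": 1, "payee": "Acme Supplies Inc.", "amount_cents": 125000},
    {"id": 2, "payee": "ACME Supplies, Inc", "amount_cents": 125000},
    {"id": 3, "payee": "Bolt Hardware LLC", "amount_cents": 125000},
    {"id": 4, "payee": "Acme Supplies Inc.", "amount_cents": 98000},
]
print(flag_possible_duplicates(payments))  # [(1, 2)]
```

A flagged pair is only a candidate for review, not proof of a duplicate; the threshold would need tuning against real data.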

A Coding Exercise

One can perform fuzzy matching in Excel (refer to the article "Excel: Fuzzy Matching" by Bill Jelen in Strategic Finance magazine). Unsurprisingly though, the functionality is extremely limited and there is little clarity on what technique is used. A better alternative may be the programming language Python.

Python is a high-level programming language that is general-purpose; it can be used to code for a wide variety of situations. The beauty of Python is that it is intuitive and relatively easy to learn, which is why it is used extensively in business. It hosts numerous libraries that are specifically designed for business-related tasks.

See this: Fuzzy Matching Exercise

Case Examples

Consider the following examples of how helpful fuzzy matching can be for accountants and auditors.

1. A CPA is performing a quarter-end bank reconciliation. There are 300+ entries on both sides. The CPA notices that transaction descriptions on bank statements are similar to those in the books, albeit with expected differences such as truncations, word order and unwanted characters. Using LDFM, the CPA can match 270 transaction descriptions between the bank and the books. The CPA also verifies that the corresponding amounts across these transactions are equal.

The CPA now begins to reconcile the remaining few transactions on both sides. Using fuzzy matching has greatly reduced the manual workload.
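A reconciliation like the one in this first example might be sketched as follows. This is an illustration under stated assumptions, not the CPA's actual procedure: the sample entries, the 0.8 threshold and the function names are hypothetical, and difflib.SequenceMatcher from the standard library stands in for an LD-based score. Descriptions are lowercased, stripped of special characters and sorted word by word so that word order does not matter, and a pair only counts as matched when the amounts are equal.

```python
import re
from difflib import SequenceMatcher


def normalize(description: str) -> str:
    """Lowercase, drop special characters, and sort the words."""
    tokens = re.findall(r"[a-z0-9]+", description.lower())
    return " ".join(sorted(tokens))


def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def reconcile(bank, books, threshold=0.8):
    """Pair bank and book entries that have equal amounts and similar text."""
    matched, unmatched_bank = [], []
    remaining = list(books)  # book entries not yet paired
    for bank_desc, bank_amt in bank:
        candidates = [
            (similarity(bank_desc, book_desc), idx)
            for idx, (book_desc, book_amt) in enumerate(remaining)
            if book_amt == bank_amt  # amounts must agree exactly
        ]
        best = max(candidates, default=None)
        if best is not None and best[0] >= threshold:
            matched.append(((bank_desc, bank_amt), remaining.pop(best[1])))
        else:
            unmatched_bank.append((bank_desc, bank_amt))
    return matched, unmatched_bank, remaining


# Hypothetical entries as (description, amount in cents).
bank = [
    ("ACH*OFFICE DEPOT #1123", 45210),
    ("AMAZON WEB SERV", 120000),
    ("WIRE FEE", 2500),
]
books = [
    ("Office Depot 1123 ACH", 45210),
    ("Amazon Web Services", 120000),
    ("Bank service charge", 2500),
]
matched, unmatched_bank, unmatched_books = reconcile(bank, books)
print(len(matched), unmatched_bank, unmatched_books)
```

In this sample, the first two entries pair up despite the reordering and truncation, while the wire fee and the service charge are left for manual review.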

2. A tax consultant is working on a reverse audit for one of their clients. The consultant must download sales/use tax permits for the client’s 500+ vendors to ascertain the type of permits held in the relevant states.

For simplicity, assume that a single file contains all permits for one vendor. The consultant has an Excel file with a list of all vendors. Instead of manually linking each vendor to their corresponding permit, the consultant employs LDFM. This results in a >90% confidence match for 460 vendors.

After a cursory review of the matches to ensure accuracy, the consultant needs to focus only on the unmatched vendors when linking permits manually. If it takes 30 seconds for the consultant to browse through all permits to find the correct permit for each vendor and they already have code available to perform LDFM, they have just reduced the task time by close to four hours (460 vendors at 30 seconds each is 13,800 seconds, or about 3.8 hours).
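The linking step in this second example could look roughly like the sketch below. The vendor names, file names and function names are made up for illustration, and the 0.9 cutoff simply mirrors the ">90% confidence" figure in the text. As an assumption for brevity, the score comes from difflib.SequenceMatcher in the standard library rather than from an LD ratio.

```python
import re
from difflib import SequenceMatcher
from pathlib import PurePath


def clean(text: str) -> str:
    """Lowercase and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def link_permits(vendors, file_names, cutoff=0.9):
    """Link each vendor to its best-scoring file, or queue it for manual review."""
    linked, manual_review = {}, []
    for vendor in vendors:
        scored = [
            # Compare against the file name without its extension.
            (SequenceMatcher(None, clean(vendor), clean(PurePath(f).stem)).ratio(), f)
            for f in file_names
        ]
        score, best_file = max(scored, default=(0.0, None))
        if score >= cutoff:
            linked[vendor] = best_file
        else:
            manual_review.append(vendor)
    return linked, manual_review


# Hypothetical vendor list and downloaded permit files.
vendors = ["Charlie Vendor", "Delta Freight", "Echo Office Supply"]
files = ["charlie_vendr.pdf", "Delta-Freight!.pdf", "misc_scan_0042.pdf"]

linked, manual_review = link_permits(vendors, files)
print(linked)
print(manual_review)
```

Here the first two vendors are linked automatically and the third is left for manual review. The sketch does not stop one file from being linked to two vendors, which is one reason a cursory review of the matches is still worthwhile.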

3. One warehouse of a manufacturing company uses LDFM to compare raw materials ordered on a purchase order to those received and listed on the invoice. This helps the warehouse detect discrepancies not only in the quantity of items ordered but also in their type.

Over the years, the warehouse has been able to reduce purchase return-related costs by up to 40% by refusing delivery of suboptimal orders.

For an outstanding use case of fuzzy matching in fraud examination, see the article "Fuzzy Matching 101: Cleaning and Linking Messy Data" by Ehsanelahi, published by Data Ladder.
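The purchase order comparison in the third example could be sketched along these lines. The line items, the 0.8 cutoff and the function names are hypothetical, and the similarity score again comes from the standard library's difflib.SequenceMatcher as a stand-in for LDFM. Each ordered item is paired with its closest invoice description. A weak best score suggests the wrong type of item was shipped, and a strong score with a different quantity suggests a quantity discrepancy.

```python
import re
from difflib import SequenceMatcher


def clean(text: str) -> str:
    """Lowercase and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def compare_po_to_invoice(po_lines, invoice_lines, cutoff=0.8):
    """Return (item, issue, invoiced_quantity) for each discrepancy found."""
    findings = []
    for item, qty_ordered in po_lines:
        scored = [
            (SequenceMatcher(None, clean(item), clean(inv_item)).ratio(), inv_qty)
            for inv_item, inv_qty in invoice_lines
        ]
        score, qty_invoiced = max(scored, default=(0.0, None))
        if score < cutoff:
            findings.append((item, "type", None))  # no similar item invoiced
        elif qty_invoiced != qty_ordered:
            findings.append((item, "quantity", qty_invoiced))
    return findings


# Hypothetical purchase order and invoice lines as (description, quantity).
purchase_order = [
    ("Steel rod 10mm", 500),
    ("Copper wire 2mm", 200),
    ("Aluminum sheet 1mm", 50),
]
invoice = [
    ("STEEL ROD 10 MM", 500),
    ("Copper wire 2mm", 180),
    ("Aluminium foil", 50),
]
findings = compare_po_to_invoice(purchase_order, invoice)
print(findings)
```

With this sample data, the copper wire is flagged for quantity (180 invoiced against 200 ordered) and the aluminum sheet is flagged for type, since no invoice line resembles it closely enough.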

A Powerful Tool

Given the nature of accounting work, fuzzy matching techniques can be a powerful tool in a CPA’s arsenal.

Fuzzy matching can be performed in Excel but is much more powerful in a programming language such as Python, an intuitive language that, beyond core software development, has extensive use cases in a business setting.

As the Python exercise above shows, it is not hard to follow most (if not all) aspects of the code even without basic knowledge of the language. CPAs should consider learning a programming language to automate many of the manual tasks they perform.

About the Author: Shivam Arora, CPA, is a data scientist. Arora holds dual master’s degrees: an MS in Accounting and an MS in Business Analytics. As an applied Artificial Intelligence (AI) consultant at one of the largest consulting firms in the world, Arora specializes in applied AI for accounting and finance. Research interests include financial modeling and statistical relationships in the financial markets, application of AI to accounting and Robotic Process Automation (RPA). Email shivam.arora@mavs.uta.edu.

 

 
