Keeping Humans in the Loop: Strengths and Shortcomings of Treasury’s AI Use Case Inventory

The Department of the Treasury's AI Use Case Inventory, published in January of this year, represents a welcome step toward transparency in the realm of governmental AI usage. Published pursuant to the Advancing American AI Act, various executive orders, and Office of Management and Budget (OMB) guidance documents, the report details sixty different uses of AI within the IRS and many more throughout the Treasury.

This required inventory of AI usage helps ensure that taxpayers have visibility into how their data and tax dollars are being put to use by those charged with administering the tax code. The IRS’s AI use cases are distinct from those of other parts of the government because they focus overwhelmingly on enforcement of the tax code, not just administrative efficiency gains. For example, the Government Accountability Office (GAO) highlights that many enforcement uses involve the auditing of partnerships, corporations, and individual taxpayers.

Ensuring that AI is used properly and carefully in these cases is crucial to protect taxpayer rights. Audits carry the threat of enforced penalties if taxpayers are determined not to be in compliance. The distinction between AI assisting in and autonomously completing activities that could lead to governmental action against taxpayers is deeply important. AI must never become an unreviewable source of enforceable government action; human input and review must be required before any enforcement action is taken. Staff training, robust oversight, and avenues for appeals are needed as safeguards. AI decisions that can result in the imposition of taxes, penalties, fines, or legal action must include human review to catch and correct errors.

GAO’s report enhances transparency around the IRS’s AI use cases in two ways. First, it includes an Excel spreadsheet of the unclassified use cases. IRS use cases vary between efficiency and enforcement functions and fall into two broad groups: employee-facing and taxpayer-facing AI applications.

Some notable IRS-internal, employee-focused use cases include:

  • The IBM Cognitive Data Mapper and AWS and LLAMA (Meta AI) Tools – This AI use case analyzes legacy code used by the IRS and converts it into a modern programming language. This is important because legacy codebases often run to millions of lines and are drastically less efficient than newer code, increasing computing costs and operational opacity.

  • The Internal Revenue Manual Research Aid – This is an internal chatbot that answers IRS employee questions about policies and procedures in the Internal Revenue Manual, the IRS’s official compendium of internal policies. This has the potential to save countless hours of employee time that would otherwise be spent combing through complex rules or requesting assistance from other employees.

  • The ServiceNow Generative AI Pilot – This case uses generative AI to summarize incidents, generate notes, and create knowledge articles to increase efficiency in IRS IT processes by reducing the time spent on busywork and resolving incidents.

Some notable external, taxpayer-facing use cases include:

  • Taxpayer Notice Modernization – This use case transforms complex tax language into clear, concise language, which an IRS official reviews for accuracy and relevance before dissemination. The intended benefits of this use case include making it easier for taxpayers to navigate their tax obligations and reducing taxpayer errors and inquiries.

  • Identity Verification with ID.me – This use case is integrated within the IRS’s identity verification process and uses machine learning to detect submissions of forged identity documents to cut down on fraud and investigatory costs. 

  • Chatbots – Uses 18 through 49 on GAO’s supplemental AI Use Case Inventory Excel spreadsheet are all taxpayer-facing chatbots that taxpayers can use to ask questions, find information, or be referred to the live human assistance they need. These chatbots improve transparency and simplicity by decreasing the learning curve for taxpayers who only wish to remain in compliance with their federal tax obligations.

GAO’s report also enhances transparency through its eight recommendations to help the IRS improve its AI usage and use case reporting. These focus on identifying and remedying staff skill gaps that hinder AI adoption, reinforcing that all unclassified use cases are subject to reporting requirements, ensuring that contractors comply with the IRS’s AI governance guidelines, improving intergovernmental coordination to reduce use case duplication, and establishing AI performance metrics. The IRS agreed with all eight recommendations and outlined the steps it is taking to implement them (see Appendix IV of the GAO report for specifics).

In addition to GAO’s use case reforms, the public Inventory itself could be further improved by either being republished at a regular interval (e.g., monthly, quarterly, or annually) or being reposted on the Treasury website each time an update is made, with a viewable edit history that allows the public to see when changes occurred. The fact that GAO’s report relies on a use case inventory from June of last year demonstrates a deficit in real-time transparency. Enhanced transparency about how inclusion in the Inventory (as opposed to remaining classified) is decided, as well as how the “high-impact” and “not high impact” statuses are defined, would be a bonus for honest government.

Apart from these areas for improvement, the Treasury’s Use Case Inventory represents a significant step toward transparency and sets an example for other federal, state and local agencies to follow more broadly. Indeed, the broad public trust needed for government at all levels to pursue ambitious efficiency reforms will depend on transparent communication. The Inventory serves as a good example deserving of recognition.