Guide for Self-Employed Professionals
How to Clean Up Your Bank CSV for Tax Prep (2026)
You downloaded your bank transactions. Now you're staring at a spreadsheet full of truncated descriptions, weird formatting, and columns that don't make sense. Here's how to turn that mess into something you can actually use at tax time.
Key Takeaways
- Bank CSV exports are messy because descriptions are machine-generated, every bank uses a different column format, and pending transactions create duplicates.
- The six-step cleanup process: normalize columns, remove duplicates, fix amount formatting, clean up descriptions, standardize dates, and add a category column.
- Each major bank has specific quirks: Chase defaults to QFX format, Bank of America may limit date ranges to 90 days per download, and Wells Fargo splits amounts into separate debit and credit columns.
- A typical self-employed person has 500 to 1,500 transactions per year. At 10 seconds per transaction, that is roughly one and a half to four hours of categorization work.
Every bank lets you download your transactions as a CSV file. In theory, this should be simple: date, description, amount. Three columns, clean data, done.
In practice? You get something like this:
03/15/2024,"CHECKCARD 0314 AMZN MKTP US*2K7X9 AMZN.COM/BILLWA",-47.32,,
03/15/2024,"PENDING: CHECKCARD 0314 AMZN MKTP US*2K7X9",-47.32,,
03/16/2024,"POS PURCHASE - STAPLES #0231 STORE 02319182",-23.87,,
03/16/2024,"ORIG CO NAME:GUSTO ORIG ID:1293847 DESC DATE:031624",2450.00,,
03/17/2024,"RECURRING PAYMENT AUTHORIZED ON 03/16 ADOBE *CREATIVE CLOUD 800-833-6687 CA",-54.99,,
That's five rows but only four transactions: the first two are the same Amazon purchase, once posted and once pending. None of them have descriptions a normal person would recognize. If you're self-employed and need to pull your business expenses out of this for Schedule C, you're in for a long afternoon.
Let's fix that.
Why Bank CSV Exports Are So Messy
Before you start cleaning, it helps to understand why the data looks the way it does. Banks don't export CSVs for your convenience. They export raw transaction logs from their processing systems, and those systems weren't built for human readability.
Descriptions are machine-generated.
When you swipe your card at Staples, the transaction description doesn't say “Staples.” It says something like “POS PURCHASE - STAPLES #0231 STORE 02319182.” That string includes the terminal type, store number, and internal reference code. Useful for the bank's fraud detection. Useless for your taxes.
Every bank uses a different format.
Chase uses one column layout. Bank of America uses another. Wells Fargo splits debits and credits into separate columns. Capital One includes a “Category” column that sounds helpful until you realize it labels your coworking space membership as “Entertainment.” There's no standard.
Pending and posted transactions both show up.
Some banks include pending transactions in the export. So you'll see the same $47.32 Amazon purchase twice: once as “PENDING” and once as the final posted version. If you don't catch that, you'll double-count the expense.
Dates and amounts have formatting quirks.
Some exports use MM/DD/YYYY, others use YYYY-MM-DD. Amounts might include dollar signs and commas (“$1,234.56”) that look fine but make Excel treat them as text instead of numbers. You won't notice until you try to sum a column and get zero.
Step-by-Step: Cleaning Up Your CSV
Here's the process for turning a raw bank export into something useful. You can do this in Excel, Google Sheets, or any spreadsheet tool.
1. Open the file and check the columns
First, see what you're working with. Most bank CSVs have at least three columns: date, description, and amount. Some have more (category, balance, check number, memo). Some have blank columns. Some have the headers on row 3 instead of row 1, with bank branding or account info above them.
Delete any rows above the actual headers. Delete any columns you don't need (running balance, check number, etc.). You want to end up with: date, description, amount.
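If you'd rather script this step than do it by hand, here is a minimal sketch in Python using only the standard library. The sample export, the "Example Bank" branding rows, and the exact header names (Date, Description, Amount) are assumptions for illustration; your bank's headers will likely differ, so adjust WANTED to match.

```python
import csv
import io

# Hypothetical raw export: two branding rows sit above the real header.
RAW = """Example Bank Online Banking
Account: ****1234
Date,Description,Amount,Running Bal.,Check Number
03/16/2024,"POS PURCHASE - STAPLES #0231 STORE 02319182",-23.87,1500.00,
03/17/2024,"RECURRING PAYMENT AUTHORIZED ON 03/16 ADOBE *CREATIVE CLOUD 800-833-6687 CA",-54.99,1445.01,
"""

# Assumed header names; rename these to whatever your bank uses.
WANTED = ["Date", "Description", "Amount"]


def load_core_columns(text):
    """Skip rows above the header, then keep only date, description, amount."""
    rows = list(csv.reader(io.StringIO(text)))
    # Treat the first row containing every wanted column name as the header.
    header_index = next(
        i for i, row in enumerate(rows) if all(name in row for name in WANTED)
    )
    header = rows[header_index]
    positions = [header.index(name) for name in WANTED]
    cleaned = []
    for row in rows[header_index + 1:]:
        if len(row) < len(header):
            continue  # skip blank or short lines
        cleaned.append({name: row[pos] for name, pos in zip(WANTED, positions)})
    return cleaned


transactions = load_core_columns(RAW)
```

With a real file, you would read the text from disk instead of the RAW string. If no row contains all three header names, the sketch stops with an error rather than guessing.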
2. Remove duplicate and pending transactions
Search for “PENDING” in the description column and delete those rows. The posted version of each transaction is usually already in the file; if it isn't, the charge hasn't settled yet and will show up in a later export. If your bank doesn't label pending transactions, sort by date and amount, then look for identical amounts on the same day or a day apart.
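The same two checks can be sketched in Python. This assumes rows shaped like the sample at the top of this guide, with MM/DD/YYYY dates; the function names are made up for this example. Note that the second function only lists candidates for you to review. Two identical charges a day apart can be legitimate, so it doesn't delete anything.

```python
from datetime import datetime, timedelta

rows = [
    {"Date": "03/15/2024", "Description": "CHECKCARD 0314 AMZN MKTP US*2K7X9 AMZN.COM/BILLWA", "Amount": "-47.32"},
    {"Date": "03/15/2024", "Description": "PENDING: CHECKCARD 0314 AMZN MKTP US*2K7X9", "Amount": "-47.32"},
    {"Date": "03/16/2024", "Description": "POS PURCHASE - STAPLES #0231 STORE 02319182", "Amount": "-23.87"},
]


def drop_pending(rows):
    """Remove rows whose description is labeled PENDING."""
    return [r for r in rows if "PENDING" not in r["Description"].upper()]


def possible_duplicates(rows, max_days_apart=1):
    """Return pairs of rows with identical amounts dated close together."""
    def parse(row):
        return datetime.strptime(row["Date"], "%m/%d/%Y")

    # Sorting by amount, then date, puts likely duplicates next to each other.
    ordered = sorted(rows, key=lambda r: (r["Amount"], parse(r)))
    pairs = []
    for earlier, later in zip(ordered, ordered[1:]):
        same_amount = earlier["Amount"] == later["Amount"]
        close_in_time = parse(later) - parse(earlier) <= timedelta(days=max_days_apart)
        if same_amount and close_in_time:
            pairs.append((earlier, later))
    return pairs


posted = drop_pending(rows)
suspects_before = possible_duplicates(rows)
suspects_after = possible_duplicates(posted)
```

Here the raw rows produce one suspect pair (the two Amazon lines), and nothing is flagged once the pending row is gone.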
3. Fix the amounts
If your bank exports debits and credits in separate columns (Wells Fargo does this), you'll want to combine them into a single “Amount” column where expenses are negative and income is positive. If amounts have dollar signs or commas baked in, use find-and-replace to strip them out so your spreadsheet treats them as numbers.
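Here is a small Python sketch of both fixes: stripping dollar signs and commas, and folding separate debit and credit cells into one signed amount. It assumes debit cells hold positive numbers (or are blank) and uses Decimal so cents don't pick up floating-point rounding. The function names are hypothetical.

```python
from decimal import Decimal


def parse_money(text):
    """Turn '$1,234.56' or '' into a Decimal; blank cells become zero."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    if not cleaned:
        return Decimal("0")
    return Decimal(cleaned)


def signed_amount(debit, credit):
    """Combine debit and credit cells: expenses negative, income positive."""
    # abs() covers exports that already write debits as negative numbers.
    return parse_money(credit) - abs(parse_money(debit))


formatted = parse_money("$1,234.56")
expense = signed_amount("23.87", "")
income = signed_amount("", "$2,450.00")
```

After this, expense is -23.87 and income is 2450.00, so a single Amount column sums correctly.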
4. Clean up the descriptions
This is the most tedious part. A description like “CHECKCARD 0314 AMZN MKTP US*2K7X9 AMZN.COM/BILLWA” just means “Amazon.” You need to figure out what each transaction actually is, especially the ones you'll be claiming as business expenses.
A few common translations:
AMZN MKTP US* → Amazon
SQ *COFFEESHOP → Square (Coffee Shop Name)
PAYPAL *ADOBESYS → Adobe (via PayPal)
TST* RESTAURANT → Toast POS (Restaurant Name)
ORIG CO NAME:GUSTO → Gusto Payroll Deposit
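If you keep a translation list like the one above, a few lines of Python can apply it for you. This is a rough sketch, not a complete merchant database: the lookup table holds only the examples from this guide, the function name is made up, and anything it doesn't recognize is returned unchanged so you can review it by hand.

```python
# Substring -> readable name, taken from the translations listed above.
MERCHANT_PATTERNS = [
    ("AMZN MKTP US*", "Amazon"),
    ("PAYPAL *ADOBESYS", "Adobe (via PayPal)"),
    ("ORIG CO NAME:GUSTO", "Gusto Payroll Deposit"),
]

# Payment processors whose prefix is followed by the actual merchant name.
PROCESSOR_PREFIXES = [
    ("SQ *", "Square"),
    ("TST*", "Toast POS"),
]


def friendly_name(description):
    """Return a readable merchant name, or the original text if unknown."""
    upper = description.upper()
    for needle, name in MERCHANT_PATTERNS:
        if needle in upper:
            return name
    for prefix, processor in PROCESSOR_PREFIXES:
        index = upper.find(prefix)
        if index != -1:
            # Whatever follows the prefix is treated as the merchant name.
            merchant = description[index + len(prefix):].strip().title()
            return f"{processor} ({merchant})"
    return description


amazon = friendly_name("CHECKCARD 0314 AMZN MKTP US*2K7X9 AMZN.COM/BILLWA")
coffee = friendly_name("SQ *COFFEESHOP")
unknown = friendly_name("UNKNOWN VENDOR 123")
```

You would grow the two tables as you run into new merchants in your own statements.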
5. Standardize the dates
Pick one format and stick with it. If you're filing U.S. taxes, MM/DD/YYYY is conventional. Watch out for dates that Excel may have reinterpreted: “01/02/2024” could be January 2nd or February 1st depending on your system settings. If dates look wrong after opening the CSV, re-import using the “Import Data” wizard where you can specify the date format.
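As a sketch, here is one way to normalize the two formats mentioned in this guide to MM/DD/YYYY in Python. It assumes slash-separated dates are month-first (the U.S. convention); it cannot tell you whether "01/02/2024" was meant as day-first, so check a few rows against your bank's website if you're unsure. The function name is hypothetical.

```python
from datetime import datetime

# Formats this sketch understands; add more if your bank uses another.
KNOWN_FORMATS = ["%m/%d/%Y", "%Y-%m-%d"]


def to_us_date(text):
    """Convert a date string in a known format to MM/DD/YYYY."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue  # try the next format
    raise ValueError(f"Unrecognized date format: {text!r}")


from_iso = to_us_date("2024-03-15")
from_us = to_us_date("03/15/2024")
```

Anything that matches neither format raises an error instead of being silently reinterpreted, which is the failure you want when dates feed into a tax return.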
6. Add a category column
Once you can read your transactions, you need to label them. For self-employed tax filing, the categories that matter are the ones on Schedule C: advertising, car and truck expenses, office expenses, supplies, utilities, and so on. Add a new column and start tagging each business transaction with the appropriate category.
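A simple keyword lookup can pre-fill that column for the merchants you see over and over. The sketch below is only an illustration: the keyword-to-category pairs are placeholders, and which Schedule C category a given purchase belongs in is still your call. Rows that match no rule get a blank category so they stand out for manual review.

```python
# Placeholder rules for illustration only; build your own list.
CATEGORY_RULES = [
    ("STAPLES", "Office expenses"),
    ("ADOBE", "Office expenses"),
]


def add_category(rows, rules=CATEGORY_RULES):
    """Return copies of the rows with a Category column added."""
    tagged = []
    for row in rows:
        upper = row["Description"].upper()
        # First matching keyword wins; no match leaves the cell blank.
        category = next((cat for keyword, cat in rules if keyword in upper), "")
        tagged.append({**row, "Category": category})
    return tagged


rows = [
    {"Date": "03/16/2024", "Description": "POS PURCHASE - STAPLES #0231 STORE 02319182", "Amount": "-23.87"},
    {"Date": "03/15/2024", "Description": "CHECKCARD 0314 AMZN MKTP US*2K7X9 AMZN.COM/BILLWA", "Amount": "-47.32"},
]
tagged = add_category(rows)
```

The Staples row is tagged and the Amazon row is left blank, since an Amazon purchase could be almost anything.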
This is where most people give up. You're 200 transactions in, you've been at it for an hour, and you still have 800 to go. We'll come back to this.
Common Issues by Bank
Different banks have different quirks. Here are the ones that trip people up most often:
Chase
Defaults to QFX format (not CSV) on the download page, so make sure to switch the file type dropdown. Limits exports to 12 months at a time. Descriptions are relatively clean compared to other banks, but still include prefixes like “ORIG CO NAME” for ACH transactions.
Bank of America
Exports include a “Running Bal.” column that clutters the spreadsheet. Descriptions tend to be heavily truncated, making it harder to identify merchants. Date ranges may be limited to 90 days per download.
Wells Fargo
Splits amounts into separate debit and credit columns. You need to merge them if you want a single amount column. Descriptions are verbose and include internal codes that make the file wider than it needs to be.
Capital One
Includes a “Category” column, which sounds helpful but uses generic consumer categories (Food & Drink, Entertainment) that don't map to Schedule C. The descriptions are often cleaner than other banks, though.
Combining CSVs from Multiple Accounts
If you use more than one bank account or credit card (most self-employed people do), you'll have multiple CSV files to deal with. The challenge is that each one probably has different columns, different formatting, and different description styles.
Before you combine them:
- Normalize the columns first. Make sure each file has the same three columns in the same order: date, description, amount. Rename headers if needed.
- Add a “Source” column. Before merging, add a column that identifies which account each transaction came from (“Chase Checking,” “Amex Business,” etc.). Once they're all in one file, you'll want to know where each row originated.
- Watch for transfers between accounts. If you moved money from checking to savings, that transfer shows up as both an outgoing and incoming transaction. Those aren't expenses or income. Remove them, or at least mark them so they don't inflate your totals.
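The checklist above can be sketched in Python like this. It assumes each file has already been reduced to the same three columns, and it treats two rows as a transfer when they share a date, come from different sources, and have exactly opposite amounts. That matching rule is an assumption; real transfers sometimes post a day apart, so treat the flag as a prompt to look, not a verdict. The sample transfer descriptions and function names are hypothetical.

```python
from decimal import Decimal


def with_source(rows, source):
    """Return copies of the rows with a Source column added."""
    return [{**row, "Source": source} for row in rows]


def flag_transfers(rows):
    """Mark same-day, opposite-amount pairs from different accounts."""
    flagged = [{**row, "Transfer": False} for row in rows]
    for i, a in enumerate(flagged):
        for b in flagged[i + 1:]:
            if (
                not a["Transfer"]
                and not b["Transfer"]
                and a["Date"] == b["Date"]
                and a["Source"] != b["Source"]
                and Decimal(a["Amount"]) == -Decimal(b["Amount"])
            ):
                a["Transfer"] = b["Transfer"] = True
                break  # each row pairs with at most one other row
    return flagged


checking = [
    {"Date": "03/18/2024", "Description": "ONLINE TRANSFER TO SAV", "Amount": "-500.00"},
    {"Date": "03/16/2024", "Description": "POS PURCHASE - STAPLES #0231 STORE 02319182", "Amount": "-23.87"},
]
savings = [
    {"Date": "03/18/2024", "Description": "ONLINE TRANSFER FROM CHK", "Amount": "500.00"},
]

combined = flag_transfers(
    with_source(checking, "Chase Checking") + with_source(savings, "Savings")
)
```

Both sides of the $500 transfer end up flagged, and the Staples purchase is left alone, so you can exclude the flagged rows from your totals.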
The Real Problem Is Not the Cleanup
Here's the thing nobody tells you about cleaning bank CSVs: the formatting issues are annoying, but they're solvable in 20 minutes. The part that actually takes hours is what comes after.
You still need to go through every transaction and decide: is this a business expense? If so, what category does it go in? Is this partially deductible (like a phone bill you use for both work and personal)? What about that Amazon purchase from July that could have been office supplies or could have been a birthday present?
That's the work that grinds people down. You're not just cleaning data at that point. You're making tax decisions on every single row, one at a time, for hundreds or thousands of transactions.
A typical self-employed person with one checking account and one credit card has somewhere between 500 and 1,500 transactions per year. Even if you spend just 10 seconds per transaction deciding its category, that's roughly an hour and a half of focused, tedious work at the low end and more than four hours at the high end. And 10 seconds per transaction is optimistic when half the descriptions are unreadable.
A Faster Way to Handle All of This
You can absolutely clean your CSVs by hand. The steps above work. But if you're looking at the process and thinking “there has to be a better way,” there is.
Categorize My Expenses was built specifically for this. Upload your bank or credit card CSV (messy formatting and all), and it normalizes the data, cleans up the descriptions, and categorizes each transaction into the correct Schedule C category. It handles the weird merchant codes, the split debit/credit columns, the duplicate pending transactions. You get a clean, categorized breakdown you can hand to your accountant or use to file your own taxes.
Instead of spending hours in a spreadsheet, the whole process takes about two minutes.
Disclaimer: This article is for educational purposes only and does not constitute tax, legal, or financial advice. Tax rules change, and individual situations vary. Consult a qualified tax professional for advice specific to your situation. Categorize My Expenses is a financial data organization tool. It is not a tax preparer and does not provide tax advice.
Related Guides
How to Download Your Bank Transactions as a CSV (2026)
Step-by-step instructions for Chase, Bank of America, Wells Fargo, Capital One, Citi, US Bank, PNC, and Discover, plus what to do if your bank isn't listed.
How to Combine Multiple Bank CSVs Into One Spreadsheet for Taxes (2026)
Downloaded CSVs from multiple bank accounts and credit cards? Here's how to merge them into one spreadsheet, fix column mismatches, and get everything ready for Schedule C.
How to Categorize Bank Transactions in Excel for Taxes (2026)
A step-by-step guide to categorizing your bank and credit card transactions in Excel or Google Sheets for Schedule C. Includes formulas, category lists, and the honest time math.
How AI Categorizes Your Expenses for Taxes (2026)
Why bank statements say AMZN MKTP US instead of Amazon, how AI reads cryptic descriptions, and how automated tools map transactions to Schedule C categories.