Have you ever received a messy spreadsheet with names in ALL CAPS, or addresses all jumbled together in one cell, or perhaps product codes with extra spaces? That’s where Excel’s text functions come in. They’re the unsung heroes of data cleaning, helping you whip that wayward text into shape and save tons of manual (and often annoyingly repetitive) work.
Let’s get to know some of these power players!
The First Impressions Experts: UPPER(), LOWER(), and PROPER()
These three functions are like the etiquette school of Excel. They help you present your text data in a polished, consistent format.
UPPER()
Syntax: UPPER(text)
How I remember it: UPPER(THISCELL) There’s not much to these, so I couldn’t be too creative.
The UPPER function turns all lowercase letters in a text string or referenced cell into uppercase. It’s perfect when you want to standardize things like acronyms or country codes.
Example: “=UPPER(A2)” makes all the letters in cell A2 uppercase.
LOWER()
Syntax: LOWER(text)
How I remember it: LOWER(this cell)
LOWER is the quiet sibling of UPPER. It converts all uppercase letters in a text string to lowercase. Great for standardizing things like email addresses.
Example: “=LOWER(B2)” makes all the letters in cell B2 lowercase.
PROPER()
Syntax: PROPER(text)
How I remember it: PROPER(This Cell)
The PROPER function capitalizes the first letter of each word in a text string and makes all other letters lowercase. It’s just the ticket for tidying up names or titles.
Example: “=PROPER(C2)” converts the text in cell C2 to proper case.
The Detail-Oriented Tailors: TRIM() and CLEAN()
These two functions are all about tidying up your text data, removing extra spaces and non-printable characters that can mess up your data analysis.
TRIM()
Syntax: TRIM(text)
How I remember it: TRIM(” trim the fat from this text “)
TRIM gets rid of any extra spaces in your text string, except for single spaces between words. It’s excellent for tidying up data copied from the web or another source, since those often have spaces or other whitespace characters attached to the beginning or end.
Example: “=TRIM(A2)” returns the text in cell A2, but with all extra spaces removed.
CLEAN()
Syntax: CLEAN(text)
How I remember it: CLEAN(remove the invisible gunk from this text)
The CLEAN function scrubs away non-printable characters from your text. These are often artifacts from data transferred from a system that uses a different character set, and can take up extra space, prevent lookups from working, and just cause headaches in general.
Example: “=CLEAN(B2)” clears out non-printable characters from the text in cell B2.
LEN()
Syntax: LEN(text)
How I remember it: LEN(how long is this)
The LEN function returns the length of a text string as the number of characters. It’s handy for when you need to sort text entries by length or restrict the length of a text entry. It can also be used in other functions to find the length of a string, as we’ll see up next in our description if REPLACE().
Example: “=LEN(E3)” gives you the length of the text in cell E3.
The Cunning Linguist: REPLACE() and SUBSTITUTE()
When you need to switch out certain bits of your text with something else, these are the functions you need.
REPLACE()
Syntax: REPLACE(old_text, start_num, num_chars, new_text)
How I remember it: REPLACE(in this text, starting from this position, this many characters, put this instead)
The REPLACE function lets you swap out a specific number of characters in your text string, starting from a position you specify. Don’t forget that you can usually make these functions even more dynamic by using other Excel functions within them to identify the starting position and number of characters to replace. This is especially useful when dealing with a dataset where these factors may vary, such as when replacing “Street” with “ST” in a column of address data.
As an example, the FIND function can help identify the start position. FIND returns the starting position of a specific text string within another text string.
Syntax: REPLACE(text, FIND(find_text, within_text, [start_num]), num_chars, new_text)
Example: “=REPLACE(A2, FIND(“Street”, A2, 1), LEN(“Street”), “ST”)”
This formula will locate the starting position of “Street” in the text string (cell A2 in this case) and replace the length of the word “Street” with “ST”. This way, you can ensure that regardless of the length of the address in each cell, “Street” will always be replaced with “ST” correctly. It’s a powerful demonstration of how Excel’s functions can be nested within each other to perform complex, dynamic tasks!
Example: “=REPLACE(A2, 5, 3, “XYZ”)” would replace three characters in cell A2, starting from the fifth character, with “XYZ”.
SUBSTITUTE()
Syntax: SUBSTITUTE(text, old_text, new_text, [instance_num])
How I remember it: SUBSTITUTE(in this text, replace this old part, with this new part, at this instance)
SUBSTITUTE swaps out specific characters or strings of characters within your text, wherever they appear. You can also specify a particular instance (occurrence) of the old text if you don’t want to change them all.
Example: “=SUBSTITUTE(B2, “old”, “new”, 1)” would replace the first instance of “old” in cell B2 with “new”.
The Data Joiner: CONCATENATE() and “&” Operator
These two pretty much do the same thing. They’re both options that let you combine text from multiple cells into a single cell. They’re great for combining things that often come in several columns (like an address being split into street number, street name, city, state, zip, etc.) into a single column.
CONCATENATE()
Syntax: CONCATENATE(text1, [text2], …)
How I remember it: CONCATENATE(just join these together, would ya?)
The CONCATENATE function takes as many text arguments as you’d like and stitches them together into a single text string. This can be particularly handy for creating full names from separate first and last name columns, or combining address information spread across multiple cells.
Example: “=CONCATENATE(A2, ” “, B2)” will combine the text in cells A2 and B2, adding a space between them.
The “&” Operator
A quicker way to join text from different cells is by using the “&” operator. It does the same thing as CONCATENATE but with less typing. Just keep in mind that some users prefer the more official version, because it declares what you’re doing in advance.
How I remember it: “This text ” & “and this” (Just squish ’em together!)
Example: “=A2 & ” ” & B2″ does the same thing as the CONCATENATE example above.
In short, both CONCATENATE and “&” are perfect for tidying up your data and creating new text strings that you can use in reports, emails, or further data manipulation. However, it’s worth noting that Excel has introduced CONCAT as an improved version of CONCATENATE that can support a range of cells (i.e. A1:E1), rather than individual cell references (A1,B1,C1,D1,E1). So, if you’re dealing with large amounts of data, CONCAT might be your new best friend!
In conclusion, Excel’s text functions are truly an asset in your data manipulation toolbox. They might seem complex at first, but once you understand how they work, you can combine them to perform complex tasks, automate tedious tasks, and even solve tricky problems. But this is not the end of the journey; in fact, it’s just the beginning.
These skills you’re honing in Excel aren’t just about mastering this single tool. You’re actually learning a universal language of data processing that applies across many tools and programming languages. This is where the real magic begins to happen. If you decide to venture into the world of coding, you’ll find many parallels, especially in a language like Python.
Python, like Excel, is widely used for data manipulation and analysis, and as an essential part of that, it offers a variety of methods for string (text) manipulation which echo the functions we’ve just learned in Excel. Here are a couple of examples:
Python’s .replace(): Just like Excel’s REPLACE function, Python also has a .replace() method which is used to replace occurrences of a particular substring within a string.
address = "123 Main Street"
print(address.replace("Street", "ST"))
This script would output “123 Main ST” – just like our Excel function!
Python’s concatenation (+ operator): Similarly, Python allows you to concatenate strings using the ‘+’ operator, a bit like using ‘&’ in Excel.
first_name = "John"
last_name = "Doe"
print(first_name + " " + last_name)
This script would output “John Doe”.
Understanding text functions in Excel doesn’t just make you an Excel wizard, it also gives you a head start when stepping into programming languages. The principles of text manipulation remain the same, only the syntax changes. So, as you learn these Excel functions, remember, you’re also dipping your toes into coding! Isn’t that cool?
Remember, Excel is just the beginning – the world of data manipulation and analysis is vast and exciting. Whether you stick with Excel or venture into Python, SQL, R, or other data-focused tools and languages, the fundamental skills you’re learning here will serve you well. So keep practicing, keep exploring, and enjoy the journey!
Happy data cleaning!





Leave a comment