Solved How would I remove text from the begining of data in cell?

Microsoft Office 2010 professional
February 8, 2016 at 16:32:14
Specs: Windows 64
I have a spreadsheet that comes loaded with data but before the company name, there's a line of text that has nothing to do with my purpose and I need to just get rid of it. It will always be a set number of spaces, 30, which counts empty spaces and text. I thought Find Replace would be the easiest way to do it but can't figure it out.

Any ideas? Macro, formula, Replace? It will be A2-A200 and the cell data will look like this;


NML: 098765432 Customer Name:


See More: How would I remove text from the begining of data in cell?

Report •

✔ Best Answer
February 10, 2016 at 16:57:24
Good try, you are close on some parts, but let's break it down and look at the details.

Let me start by saying that what I trying to do was eliminate the hard-coded numbers (20, 31, 50, etc.) and let the formula eliminate the extraneous text by finding the colons. I was not entirely successful in doing that (the 14 is still a hard coded number) and I'll explain why later. I posted the slightly faulty formula anyway just as a learning experience.

Whatever you learn from this, you'll be able to use later and I think that is what you might enjoy that. (In fact, I learned something while doing this, so I'm glad I tried it.)

OK, Let’s start with the overall MID function and break down the arguments

MID(text, start_num, num_chars)

Text: A1 - obviously ;-)
start_num: FIND(":",A1,FIND(":",A1)+1)+2
num_chars: coming later

How that works…

FIND(find_text, within_text, [start_num])

find_text: ":"

within_text: A1

[start_num]: FIND(":",A1)+1)+2

By using a second FIND function as the [start_num] argument for the first FIND, I am telling Excel to start looking for a colon after the position of the first colon.

FIND(":",A1)+1 means return the position of the colon and then add 1, giving us a [start_num] that is one position past the first colon. This returns the position of the second colon.

e.g. In the string 12:45: the first colon is in position 3, the second is in position 6 therefore FIND(":",A1,FIND(":",A1)+1) means to find a colon in A1 but start looking at position 4, not position 1. In essence, the first FIND doesn’t even know about the first colon because it doesn’t start looking for a colon until position 4, based on the value returned by the second FIND.

Then all we do is add 2 to that "position" to find the location of the start_num for the MID function. (2 positions past the second colon.)

OK, now let’s determine how many characters the MID function should extract:

num_chars: FIND(CHAR(1),SUBSTITUTE(A1,":",CHAR(1),3))-14-FIND(":",A1,FIND(":",A1)+1)

In pieces:

FIND(CHAR(1),SUBSTITUTE(A1,":",CHAR(1),3))

This is a quick way to find the third colon in a more direct manner than using multiple finds. Allow me to explain.

CHAR(number)

This function returns the character assigned to the ASCII number used by the function.

e.g. =CHAR(65) would place an “A” in the cell.

CHAR(1) returns a character that would normally not be be found in an Excel cell, since it is not something that can be typed in (SOH, start of heading). (More on that later)

SUBSTITUTE(text, old_text, new_text, [instance_num])

SUBSTITUTE(A1,":",CHAR(1),3)

text: A1
old_text : a colon
new_text: CHAR(1)
[instance_num]: 3

This will “substitute” an SOH character for the third colon found in A1. CHAR(1) (SOH) is a safe character to use, since the odds of it already existing in the cell is extremely slim. The SUBSTITUTE function doesn’t change A1; it just changes what A1 “looks like” to whatever function is about to reference A1.

FIND(CHAR(1),SUBSTITUTE(A1,":",CHAR(1),3))

In other words, “Find an SOH character in whatever A1 would look like if you substituted an SOH character for the third colon. ;-)

OK, so now the FIND function has found the position of the third colon. Now we subtract 14 from that to account for “Instance ID” then subtract the number of characters up to the space after the second colon, leaving us with the number of characters in the string we want to extract.

QED

Now, as I said before, my goal was to eliminate all hard coded numbers so that the formula could be used regardless of the number of characters in the entire string or in the CIN value or anything else. I was trying to extract the string solely based on the locations of the colons. Unfortunately, by the time I got burnt out from working on this, I couldn’t come up with a way to deal with the fact that third colon comes after the Instance ID and we had to back up from there. In other words, I couldn’t figure out how to have the formula determine the length of “Instance ID” by itself. For example, as written, if “Instance ID:” suddenly becomes “Instance ID Number:” my formula will fail.

I’m sure it can be done, but as of now I don’t know how. Work on that for us, would you? ;-)

Click Here Before Posting Data or VBA Code ---> How To Post Data or Code.



#1
February 8, 2016 at 17:40:48
Try this:

In the Find box enter: * Customer Name
That is the wild card symbol star * then the Customer Name

In the Replace box enter: Customer Name

That is just the Customer Name

See how that works.

MIKE

http://www.skeptic.com/


Report •

#2
February 8, 2016 at 17:46:29
You could try this:

=RIGHT(A2,LEN(A2)-30)

The RIGHT function take 2 arguments:

RIGHT(text,[num_chars])

e.g.
=RIGHT("Bob",2) would return "ob"
=RIGHT(A1,2) would return the 2 right most characters in A1

Since we don't know how many characters we need to return, we can let the LEN function determine that. The LEN function returns the number a characters in a cell.

Since we know that we want 30 less characters than are in the cell, we can use LEN(A2)-30 as the [num_chars] argument of the RIGHT function.

With NML: 098765432 Customer Name: Tom in A2, we would get:

Tom

Click Here Before Posting Data or VBA Code ---> How To Post Data or Code.


Report •

#3
February 8, 2016 at 17:50:22
Mike,

I believe that she has something like this, where the entire string she posted needs to be stripped off. (I think those are the 30 characters she mentioned.)

NML: 098765432 Customer Name: Tom
NML: 023546982 Customer Name: Mike
NML: 056982134 Customer Name: Derby
NML: 056859725 Customer Name: Ginger

Click Here Before Posting Data or VBA Code ---> How To Post Data or Code.

message edited by DerbyDad03


Report •

Related Solutions

#4
February 8, 2016 at 17:52:53
Oh yeah, if she has what I think she does, she could also use the Data...Text-To-Columns feature and use the colon as a delimiter or a "customized" fixed width Text-To-Columns.

Click Here Before Posting Data or VBA Code ---> How To Post Data or Code.


Report •

#5
February 8, 2016 at 18:34:09
DerbyDad,

I thought of that after I posted.
Was waiting to see her response,
but I don't think the Find & Replace will be what she needs.

MIKE

http://www.skeptic.com/


Report •

#6
February 9, 2016 at 05:52:07
The Find & Replace could work if colon is the delimiter.

So the Find box would be: *: <space>

That is the wild card symbol star *
then the colon : delimiter
and a literal space character

The Replace would be blank

If there are two colons, run the Find & Replace twice.
or just do *:*:<space>

MIKE

http://www.skeptic.com/

message edited by mmcconaghy


Report •

#7
February 9, 2016 at 09:26:23
Tom/Mike, thank you for your ideas. Watching you two go back and forth is a learning experience all on its own.

The RIGHT function worked well but made me greedy for more. I played around with LEFT and got the end 20 characters removed but haven't been able to get RIGHT and LEFT to work together.

This is one cell:
CIN: 299990399 Customer Name: PARK CENTRAL HYDRANTS OF MANHATTAN Instance ID: 1998PP
And here's another:
CIN: 002233668 Customer Name: MLK LLC Instance ID: 555JJ6

So you can see the size of the data varies with nothing set except for the first 30 characters and the last 20. Ideally I'd like to shave off both ends and just have the company name.

Is there a formula for that?

Ginger


Report •

#8
February 9, 2016 at 12:26:48
Here are two ways:

Using the MID() function, which is the sister/brother/cousin
of the LEFT() & RIGHT() functions:

=MID(A1,31,LEN(A1)-50)

I'll let you figure out why I used the numbers I did. :-)

Second way is to use Find & Replace twice:

First - Find box would be: *: <space>
Replace with nothing

Second - Find box would be: <space> Instance*
Replace with nothing

The advantage of the Find & Replace is you do not need a helper column,
nor do you need to do a Copy / Save Values.

MIKE

http://www.skeptic.com/


Report •

#9
February 9, 2016 at 13:01:42
I confess, I only tried the MID because I love the simplicity of formula.

And the 31 must be where the beginning of the desired text begins and the -50 is telling Excel if 30 is out at the beginning of the text, then 20 would have to be removed from the end?

This worked great, thank you, Mike!

Ginger


Report •

#10
February 9, 2016 at 13:19:09
Your analysis of the formula is correct.
Glad we could help.

MIKE

http://www.skeptic.com/

message edited by mmcconaghy


Report •

#11
February 9, 2016 at 18:25:36
What is the name of that movie? Oh yeah!

Analyze This...

=MID(A1,FIND(":",A1,FIND(":",A1)+1)+2,FIND(CHAR(1),
SUBSTITUTE(A1,":",CHAR(1),3))-14-FIND(":",A1,FIND(":",A1)+1))

Have fun!

Click Here Before Posting Data or VBA Code ---> How To Post Data or Code.


Report •

#12
February 10, 2016 at 06:57:12
CIN: 299990399 Customer Name: PARK CENTRAL HYDRANTS OF MANHATTAN Instance ID: 1998PP

A challenge....okay, using this string as A1, the Find is looking for the colon, then this [MID(A1,FIND(":",] is looking for the first colon and [(":",A1,FIND(":",A1)+1)+2,] is looking for the second and third colons?

then [FIND(CHAR(1),] is looking for the first character in the cell, the letter C in CIN?

and [SUBSTITUTE(A1,":",CHAR(1),3)] is replacing the C with a colon?

and [)-14]is removing the first 14 characters which would take everything before the Park?

then [-FIND(":",A1,FIND(":",A1)+1))] must be saying to do it at the end of the text too but I don't understand this part. How does it know to get rid of the Instance ID: 1998PP


Report •

#13
February 10, 2016 at 16:57:24
✔ Best Answer
Good try, you are close on some parts, but let's break it down and look at the details.

Let me start by saying that what I trying to do was eliminate the hard-coded numbers (20, 31, 50, etc.) and let the formula eliminate the extraneous text by finding the colons. I was not entirely successful in doing that (the 14 is still a hard coded number) and I'll explain why later. I posted the slightly faulty formula anyway just as a learning experience.

Whatever you learn from this, you'll be able to use later and I think that is what you might enjoy that. (In fact, I learned something while doing this, so I'm glad I tried it.)

OK, Let’s start with the overall MID function and break down the arguments

MID(text, start_num, num_chars)

Text: A1 - obviously ;-)
start_num: FIND(":",A1,FIND(":",A1)+1)+2
num_chars: coming later

How that works…

FIND(find_text, within_text, [start_num])

find_text: ":"

within_text: A1

[start_num]: FIND(":",A1)+1)+2

By using a second FIND function as the [start_num] argument for the first FIND, I am telling Excel to start looking for a colon after the position of the first colon.

FIND(":",A1)+1 means return the position of the colon and then add 1, giving us a [start_num] that is one position past the first colon. This returns the position of the second colon.

e.g. In the string 12:45: the first colon is in position 3, the second is in position 6 therefore FIND(":",A1,FIND(":",A1)+1) means to find a colon in A1 but start looking at position 4, not position 1. In essence, the first FIND doesn’t even know about the first colon because it doesn’t start looking for a colon until position 4, based on the value returned by the second FIND.

Then all we do is add 2 to that "position" to find the location of the start_num for the MID function. (2 positions past the second colon.)

OK, now let’s determine how many characters the MID function should extract:

num_chars: FIND(CHAR(1),SUBSTITUTE(A1,":",CHAR(1),3))-14-FIND(":",A1,FIND(":",A1)+1)

In pieces:

FIND(CHAR(1),SUBSTITUTE(A1,":",CHAR(1),3))

This is a quick way to find the third colon in a more direct manner than using multiple finds. Allow me to explain.

CHAR(number)

This function returns the character assigned to the ASCII number used by the function.

e.g. =CHAR(65) would place an “A” in the cell.

CHAR(1) returns a character that would normally not be be found in an Excel cell, since it is not something that can be typed in (SOH, start of heading). (More on that later)

SUBSTITUTE(text, old_text, new_text, [instance_num])

SUBSTITUTE(A1,":",CHAR(1),3)

text: A1
old_text : a colon
new_text: CHAR(1)
[instance_num]: 3

This will “substitute” an SOH character for the third colon found in A1. CHAR(1) (SOH) is a safe character to use, since the odds of it already existing in the cell is extremely slim. The SUBSTITUTE function doesn’t change A1; it just changes what A1 “looks like” to whatever function is about to reference A1.

FIND(CHAR(1),SUBSTITUTE(A1,":",CHAR(1),3))

In other words, “Find an SOH character in whatever A1 would look like if you substituted an SOH character for the third colon. ;-)

OK, so now the FIND function has found the position of the third colon. Now we subtract 14 from that to account for “Instance ID” then subtract the number of characters up to the space after the second colon, leaving us with the number of characters in the string we want to extract.

QED

Now, as I said before, my goal was to eliminate all hard coded numbers so that the formula could be used regardless of the number of characters in the entire string or in the CIN value or anything else. I was trying to extract the string solely based on the locations of the colons. Unfortunately, by the time I got burnt out from working on this, I couldn’t come up with a way to deal with the fact that third colon comes after the Instance ID and we had to back up from there. In other words, I couldn’t figure out how to have the formula determine the length of “Instance ID” by itself. For example, as written, if “Instance ID:” suddenly becomes “Instance ID Number:” my formula will fail.

I’m sure it can be done, but as of now I don’t know how. Work on that for us, would you? ;-)

Click Here Before Posting Data or VBA Code ---> How To Post Data or Code.


Report •

Ask Question