Avoiding double-counting with countif

Microsoft Excel 2007
December 21, 2009 at 11:43:25
Specs: Windows XP
I was wondering if someone could help with the following problem I'm facing in Excel using countif. My problem is bit more complicated, but let me present a simplified version of it. Suppose I have the following column:

A1: apple juice
A2: banana juice
A3: apple banana juice
A4: grape juice

I want to know how many entries contain apple or banana. The answer should be 3, due to A1, A2, and A3. I'm wondering if there's a simple way to do this calculation. If I use countif to count how many times "apple" appears and add that to countif of how many times "banana" appears (countif(A1:A4, "*apple*")+countif(A1:A4, "*banana*")), I get 4 because entry A3 is double counted. Is there a way to use the countif function (or another one) to get the right answer (something like countif(A1:A4, "*apple*" or "*banana*"), which I know doesn't work)? I know there's a more complex way to do it (COUNTIF(A1:A4, "*apple*")+COUNTIF(A1:A4, "*banana*")-COUNTIF(A1:A4, "*apple*banana*")-COUNTIF(A1:A4, "*banana*apple*")), but I'd like to avoid that complexity given the possible combinations I have in my actual problem.

Thanks a lot!

See More: Avoiding double-counting with countif

Report •

December 21, 2009 at 12:36:39
The problem with simplified examples is that they lead to simplified solutions! ;-)

In B1 I entered this formula and dragged it down to B4:

=LEFT(A1,FIND(" ",A1)-1)

This gave me:


Then I used your simple COUNTIF + COUNTIF syntax:


This produced the 3 that you asked for.

Report •

December 21, 2009 at 13:38:44

As i don't know the actual use you want to put a formula to, I can only offer a suggestion that might be of use.

The 'solution' uses Excel's array formulas.
Array formulas are entered into a cell as normal and then with the cursor in the cell, in the formula, hit Ctrl+Shift+Enter (all three keys together). The array formula will then be surrounded by curly braces {=formula}

I used the following data:

1	apple juice
2	banana juice
3	apple banana juice
4	grape juice
5	grape peach juice
6	juice banana apple

Note that in this solution the order of words is not important.

In another cell I entered this formula:


With the cursor in the cell, hit the three keys Ctrl+Shift+Enter and you get:

I have split both formulas onto two lines for ease of viewing.

This formula returns the value 4.

To make these formulas more useful, the text of the words to find, can be replaced by cell references - but remember that every time you change the formula you must use the three key combination.

Using cell references allows a matrix to be used:

	A	B	C	D	E
		apple	banana	grape	peach
18	apple	3	4	5	4
19	banana	4	3	5	4
20	grape	5	5	2	2
21	peach	4	4	2	1

The formula in cell B18 (after using Ctrl+Shift+Enter) is:
(all one line)
Once the array formula has been created in cell B18 it can be extended by dragging as usual.



Report •

December 22, 2009 at 06:10:46

Here is a user defined function that will count cells with at least one of the words in them. No double counting, words can be in any order, and the search / count can be case-sensitive or case insensitive.

Once the UDF has been written, the formula used to do the count is simple, e.g., =UCT(A1:A20, B1, 1)

Here is the code, which is entered in a standard module in the Visual Basic window in Excel (Alt+f11):

Public Function UCT(rngRange As Range, Words As String, _
                Optional intCase As Integer = 1) As Variant
' Unique Cell Text function
' Returns number of cells containing any or all the words passed in 'Words'
' Words is a comma delimited string of words
' Case is Zero or One. If One (default) the test is not case sensitive
Dim strWordsArry() As String
Dim rngCell As Range
Dim blnFoundOne As Boolean
Dim intCde As Integer
Dim m, n As Integer

On Error GoTo ErrHnd

'check that 'Words' is not empty
If Len(Words) = 0 Then
    UCT = CVErr(xlErrNA)
    Exit Function
End If

'get number of words in 'Words'
m = 0
For n = 1 To Len(Words)
    If Mid(Words, n, 1) = "," Then m = m + 1
Next n

'resize array
ReDim strWordsArry(m)

'put words in array - ignore spaces and non-alpha-numeric characters
m = 0
For n = 1 To Len(Words)
    If Mid(Words, n, 1) <> "," Then
        intCde = Asc(Mid(Words, n, 1))
            If (intCde > 47 And intCde < 58) Or (intCde > 64 And intCde < 91) _
                    Or (intCde > 96 And intCde < 123) Then
                strWordsArry(m) = strWordsArry(m) + Mid(Words, n, 1)
            End If
        'next word
        m = m + 1
    End If
Next n

'loop through each cell in range
For Each rngCell In rngRange
    'test each cell for each word
    blnFoundOne = False
    For n = 0 To UBound(strWordsArry())
        If InStr(1, rngCell.Text, strWordsArry(n), intCase) > 0 Then
            'one of the words in 'Words' found in the cell
            UCT = UCT + 1
            blnFoundOne = True
        End If
        'only need to find one of the words in a cell
        If blnFoundOne Then Exit For
    Next n
Next rngCell
Exit Function

'error handler
UCT = CVErr(xlErrValue)
End Function

I have only tested this on the following data:
In cells A1 to A7
apple juice
banana juice
apple banana juice
grape juice
grape peach juice
juice banana apple
peach, orange
In cells C1 and D1 enter the words to be searched for and in cell C2 enter:
Extend the formula to D2.
The zero at the end of the formula is for a case-sensitive match. Omitted or 1 (default) results in a case-insensitive match.
This was the result of a case-sensitive search/count:
	C		D
1	Apple, grape	apple,grape,banana
2	2		6


Report •

Related Solutions

Ask Question