How to check if a value in a dictionary has duplicates?

The algorithm below checks to see if an array has at least two or more duplicates. It uses a dictionary to store the occurrences; the time complexity is linear because it has to traverse the dictionary to see if a key occurs twice. In swift, how can I look up a value to see if it occurs more than twice in constant time ?

 func containsDuplicate(_ nums: [Int]) -> Bool {
        var frequencyTable = [Int:Int]()
        
        for num in nums {
            frequencyTable[num] = (frequencyTable[num] ?? 0 ) + 1
        }
        
        
        for value in frequencyTable.values{
            if value >= 2 {
                return true
            }
            
        }
        return false
    }
    
    containsDuplicate([1,1,2,3,3,3,3,4])

If you are only trying to detect the presence of at least one duplicated element in a collection, then instead of constructing the entire frequency table, a better approach is to stop early as soon as the first duplicate is found:

extension Collection where Element: Hashable {
  func containsDuplicates() -> Bool {
    var alreadySeen: Set<Element> = []
    
    for x in self {
      if !alreadySeen.insert(x).inserted {
        return true
      }
    }
    
    return false
  }
}
2 Likes

Edit: Never mind… while this should work, there should be faster algorithms.

Without giving it too much thought (by which I mean I haven't researched whether there are any "known best" ways to do this), I'd probably write something like

func containsDuplicate(_ nums: [Int]) -> Bool {
  for num in nums {
    if nums.firstIndex(of: num) != nums.lastIndex(of: num) {
      return true
    }
  }
  return false
}

i'd just compare the count of set to the count or the array:

func containsDuplicate(_ nums: [Int]) -> Bool {
     Set(nums).count != nums.count
}

but don't do this for floats.

Isn’t this quadratic?

2 Likes

Oh! Yes, I believe so… My mistake, for some reason I was thinking that firstIndex(of:) and lastIndex(of:) where both constant time, but the docs say they’re O(n). So that would make my method O(2n²), I believe, in the worst case where there’s only one duplicated element and they’re both at the end of the array.

1 Like

You can't check an array for duplicates in constant time. Constant time means you can't consider each element in the array, so to achieve this you will need to calculate whether duplicates exist in advance, e.g. while adding items to the array, and store the result for O(1) lookup.

If you are able to use NSCountedSet, it provides a very nice way to do this: When an item is added to the array, add it to the counted set as well. When an item is removed from the array, remove it from the counted set. If the counted set's count is less than the array's, there is at least one duplicate item.

If you can't or don't want to use NSCountedSet, it's fairly easy to implement one based on a dictionary.

1 Like

Here's an equivalent for Element: Hashable. The performance is obviously a lot worse.

extension Sequence where Element: Equatable {
  func containsDuplicates() -> Bool {
    var alreadySeen: [Element] = []
    
    for element in self {
      if alreadySeen.contains(element) {
        return true
      }
      alreadySeen.append(element)
    }
    
    return false
  }
}

Worth noting that @Nevin's example can also be pushed down to work on Sequence.

Not that the constants usually matter in Big-O notation, but I think it’s actually closer to O(n²).

Since firstIndex(of:) and lastIndex(of:) iterate over the array from different ends, the total number of elements visited would always be at most n + 1, not 2n. (For example, [1, 2, 3, 4, 5].firstIndex(of: 4) will visit 1, 2, 3, 4 and [1, 2, 3, 4, 5].lastIndex(of: 4) will visit 5, 4.)

So still bad, but not double-bad. :slightly_smiling_face:

Dictionary got initializers four years ago to make this simple. The equivalent never came to Set or uniqued. :crying_cat_face:

extension Sequence where Element: Hashable {
  var containsDuplicate: Bool {
    do {
      try Dictionary(
        zip(self, AnyIterator { })
      ) { _, _ in
        throw Error()
      }

      return false
    } catch {
      return true
    }
  }
}

private struct Error: Swift.Error { }

e.g.

public extension SetAlgebra {
  /// - Throws: if `elements` contains any duplicates.
  init<Elements: Sequence>(unique elements: Elements) throws
  where Elements.Element == Element {
    self = try elements.reduce(into: .init()) {
      guard $0.insert($1).inserted
      else { throw Error() }
    }
  }
}

ok, i reread the question.. the first part of the question describes an algorithm that "checks to see if an array has at least two or more duplicates". as stated, this:

[1, 2, 1] // has only one duplicate, so shall return false
[1, 2, 1, 2, 3] // has two duplicates, so shall return true

perhaps it is a typo, and the question should have been: "checks to see if an array has at least one duplicate".

the actual algorithm presented is written in a way to find at least one duplicate, which supports the assumption that there was a typo in the statement (two+ duplicates vs one+ duplicate).

so far so good.

now this. as stated, you need to look up A VALUE (* where exactly?) to see if it occurs more than TWICE. and do it in constant time. ok, so you just use the resulting dictionary (* assuming that by "looking up a value" you mean to look it up in the already calculated dictionary) and do this in constant time:

frequencyTable[num] > 2 // "more than twice" check, O(1)

and all the answers above answering a question how to check if array has duplicates are answering a different question that wasn't asked!

1 Like
Terms of Service

Privacy Policy

Cookie Policy