Memory leak with String.range(options: .regularExpression) [SR-3536]


(Nethra Ravindran) #1

Hi everyone,

I am working on https://bugs.swift.org/browse/SR-3536

There is a memory leak when searching for a substring of a string using a
regular expression.

import Foundation

let myString = "Foo"
for _ in 1...10000 {
  let _ = myString.range(of: "bar", options: .regularExpression)
}

From the above test case I could see that, over a period of time, around
60 MB of memory was leaked.

I see that String.range eventually calls NSString._createRegexForPattern.
There we maintain a mapping from NSString to NSRegularExpression objects
in an NSCache<NSString, NSRegularExpression>. All the entries in the cache
are kept in a dictionary (_entries) whose key is an UnsafeRawPointer, which
seems to be the address of the NSString object, and whose value is an
NSCachedEntry.

Although the pattern is of type String, it is stored in the NSCache as an
NSString. Since the NSCachedEntry objects are stored in a dictionary
indexed by the address (UnsafeRawPointer) of the NSString object, a new
cache entry is created on each iteration of the test case, even though the
pattern string remains the same.
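
The effect described above can be reproduced in isolation. The following is only a sketch under stated assumptions, not the NSCache source: PatternKey, IdentityKeyedCache and ValueKeyedCache are hypothetical names, and ObjectIdentifier stands in for the UnsafeRawPointer address used as the dictionary key. It compares a dictionary indexed by object identity with one indexed by the pattern's value when a fresh key object is created on every iteration.

```swift
// Hypothetical reference-type key standing in for a freshly bridged NSString.
final class PatternKey {
    let pattern: String
    init(_ pattern: String) { self.pattern = pattern }
}

// Sketch of a cache indexed by object identity (the key's address).
struct IdentityKeyedCache {
    // Each entry keeps its key alive, so an address is never reused.
    private var entries: [ObjectIdentifier: (key: PatternKey, value: Int)] = [:]

    var count: Int { return entries.count }

    mutating func set(_ value: Int, forKey key: PatternKey) {
        entries[ObjectIdentifier(key)] = (key: key, value: value)
    }
}

// Sketch of a cache indexed by the pattern's value instead.
struct ValueKeyedCache {
    private var entries: [String: Int] = [:]

    var count: Int { return entries.count }

    mutating func set(_ value: Int, forKey key: String) {
        entries[key] = value
    }
}

var byIdentity = IdentityKeyedCache()
var byValue = ValueKeyedCache()

for i in 1...1000 {
    // A new key object per iteration, although the pattern text is the same.
    byIdentity.set(i, forKey: PatternKey("bar"))
    byValue.set(i, forKey: "bar")
}

print(byIdentity.count)  // 1000: one entry per key object
print(byValue.count)     // 1: equal patterns share one entry
```

The identity-keyed dictionary grows with every call because no two key objects share an address, which matches the growth seen in the test case.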

Can someone guide me on how to go about resolving this issue?

Thank you.

- Nethra Ravindran


(Tony Parker) #2

Hi Nethra,

Looks like you’ve done most of the analysis, so you’re already pretty much there. =)

Is there some other way we could be caching the results here?

- Tony
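
One possible answer to the question above, offered only as a sketch: key the cache on the pattern's String value rather than on an object address. RegexCache and regex(for:) are hypothetical names that do not exist in Foundation, and the class is not thread-safe; NSRegularExpression(pattern:options:) and firstMatch(in:options:range:) are the existing Foundation calls.

```swift
import Foundation

// Hypothetical helper: caches compiled expressions by the pattern's value.
final class RegexCache {
    private var entries: [String: NSRegularExpression] = [:]

    var count: Int { return entries.count }

    // Returns the cached expression for `pattern`, compiling it on first use.
    func regex(for pattern: String) throws -> NSRegularExpression {
        if let cached = entries[pattern] {
            return cached
        }
        let compiled = try NSRegularExpression(pattern: pattern, options: [])
        entries[pattern] = compiled
        return compiled
    }
}

let regexCache = RegexCache()
let text = "Foo"
var matches = 0

for _ in 1...10_000 {
    if let regex = try? regexCache.regex(for: "bar") {
        let whole = NSRange(location: 0, length: text.utf16.count)
        if regex.firstMatch(in: text, options: [], range: whole) != nil {
            matches += 1
        }
    }
}

print(regexCache.count)  // 1: the pattern is compiled and stored once
print(matches)           // 0: "bar" does not occur in "Foo"
```

Because String hashes and compares by value, repeated lookups of the same pattern reuse a single entry regardless of how many temporary objects the caller creates.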

···

On Feb 7, 2017, at 11:44 PM, Nethra Ravindran via swift-corelibs-dev <swift-corelibs-dev@swift.org> wrote:

_______________________________________________
swift-corelibs-dev mailing list
swift-corelibs-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-corelibs-dev


(Alex Blewitt) #3

There's a 'cache.countLimit = 10' set on the cache:

https://github.com/apple/swift-corelibs-foundation/blob/16657160c2c441a58ea01bf7baa90607a0b395f7/Foundation/NSString.swift#L109

Shouldn't it start discarding some of the previous entries after it hits the first 10?

Alex
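
For comparison, this is roughly the behaviour one might expect from a count limit. It is a minimal sketch, not the NSCache implementation: BoundedCache is a hypothetical type, and evicting the oldest entry first is an assumption, since the thread does not say which eviction order is intended.

```swift
// Hypothetical bounded cache: keyed by value, evicts the oldest entry
// once `countLimit` is exceeded.
struct BoundedCache<Key: Hashable, Value> {
    let countLimit: Int
    private var storage: [Key: Value] = [:]
    private var insertionOrder: [Key] = []

    init(countLimit: Int) {
        precondition(countLimit > 0, "countLimit must be positive")
        self.countLimit = countLimit
    }

    var count: Int { return storage.count }

    func value(forKey key: Key) -> Value? {
        return storage[key]
    }

    mutating func set(_ value: Value, forKey key: Key) {
        // Record the insertion order only for keys not already present.
        if storage.updateValue(value, forKey: key) == nil {
            insertionOrder.append(key)
        }
        // Discard the oldest entries until the limit is respected again.
        while storage.count > countLimit {
            let oldest = insertionOrder.removeFirst()
            storage.removeValue(forKey: oldest)
        }
    }
}

var cache = BoundedCache<String, Int>(countLimit: 10)
for i in 1...10_000 {
    cache.set(i, forKey: "pattern-\(i)")
}

print(cache.count)                            // 10
print(cache.value(forKey: "pattern-1") == nil)  // true: evicted long ago
```

With a limit of 10, the entry count stays at 10 no matter how many distinct keys are inserted, which is the behaviour the question above anticipates.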

···

On 8 Feb 2017, at 16:51, Tony Parker via swift-corelibs-dev <swift-corelibs-dev@swift.org> wrote:


(Alex Blewitt) #4

It doesn't look like the cache removes any entries, nor does it work when a String is used as a key. I added the information below to https://bugs.swift.org/browse/SR-3536

> c.setObject("foo",forKey:"foo")
> c.object(forKey:"foo")
$R6: Foundation.NSString? = nil

Alex

···

On 8 Feb 2017, at 16:59, Alex Blewitt via swift-corelibs-dev <swift-corelibs-dev@swift.org> wrote: