String manipulation (replacingOccurrences) in Swift on Linux


(Carlos Barrera Hernandez) #1

Hi,

I’m currently using Swift to build a project, and have hit a roadblock that is pretty hard to bypass.

The main workflow of the project is
- Read a csv file
- Extract the values of the rows and columns
- Process data extracted from file
- Write some results to another file

Pretty simple and straightforward. I got it working on OS X with the 2016-05-09 (a) toolchain without problems, using the CSV parser from https://github.com/Daniel1of1/CSwiftV with minimal modifications to compile with Swift 3.

But I need it to compile and work on Ubuntu Linux.
I prepared a work environment following the instructions in https://www.raywenderlich.com/122189/introduction-to-open-source-swift-on-linux
and it compiles Swift code wonderfully; I even installed the same snapshot of the Swift toolchain to keep everything consistent.

The problem I found is that some of the string functionality the CSV parser needs does not work properly on Linux.
- String.replacingOccurrences(of: stringToFind, with: stringToPutInstead)
does not behave the same in both environments.

I have a CSV file, obtainable from https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients,
that contains the pattern “\r\n”, and I need it replaced with “\n” before processing the rest of the file.

The code I use is:

        #if os(Linux)
            print("Let's change those carriage returns...")
            processedString = string.replacingOccurrences(of: "\r\n", with: "\n")
        #else
            processedString = string.replacingOccurrences(of: "\r\n", with: "\n")
        #endif
It compiles flawlessly on Linux, but at run time it fails with an error message like:
Let's change those carriage returns...
Illegal instruction (core dumped)
I tried escaping the pattern (“\\r\\n”); that does not throw the error, but it does not replace anything either.
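
A minimal sketch of why the escaped pattern matches nothing: in a Swift string literal, "\r\n" is the two bytes CR and LF, while "\\r\\n" is four ordinary characters (backslash, r, backslash, n) that never occur in the file. The constant names here are only illustrative.

```swift
// "\r\n" is a carriage return followed by a line feed.
let realPattern = "\r\n"
// "\\r\\n" is a literal backslash, 'r', backslash, 'n'.
let escapedPattern = "\\r\\n"

let realBytes = Array(realPattern.utf8)
let escapedBytes = Array(escapedPattern.utf8)

print(realBytes)     // [13, 10]
print(escapedBytes)  // [92, 114, 92, 110]
```

So the escaped version avoids the crash only because it never finds anything to replace.
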
When I tried executing the same code in the REPL on the Linux virtual machine, the sequence of events was:

  1> import Foundation
  2> let path = "/vagrant/credit_card.csv"
path: String = "/vagrant/credit_card.csv"
  3> let text = NSString(contentsOfFile: path as String, encoding: NSUTF8StringEncoding)
text: Foundation.NSString = {
  Foundation.NSObject = {}
  _cfinfo = {
    info = 1920
    pad = 0
  }
  _storage = "ID,LIMIT_BAL,SEX,EDUCATION,MARRIAGE,AGE,PAY_0,PAY_2,PAY_3,PAY_4,PAY_5,PAY_6,BILL_AMT1,BILL_AMT2,BILL_AMT3,BILL_AMT4,BILL_AMT5,BILL_AMT6,PAY_AMT1,PAY_AMT2,PAY_AMT3,PAY_AMT4,PAY_AMT5,PAY_AMT6,default payment next month\r\n1,20000,2,2,1,24,2,2,-1,-1,-2,-2,3913,3102,689,0,0,0,0,689,0,0,0,0,1\r\n2,120000,2,2,2,26,-1,2,0,0,0,2,2682,1725,2682,3272,3455,3261,0,1000,1000,1000,0,2000,1\r\n3,90000,2,2,2,34,0,0,0,0,0,0,29239,14027,13559,14331,14948,15549,1518,1500,1000,1000,1000,5000,0\r\n4,50000,2,2,1,37,0,0,0,0,0,0,46990,48233,49291,28314,28959,29547,2000,2019,1200,1100,1069,1000,0\r\n5,50000,1,2,1,57,-1,0,-1,0,0,0,8617,5670,35835,20940,19146,19131,2000,36681,10000,9000,689,679,0\r\n6,50000,1,1,2,37,0,0,0,0,0,0,64400,57069,57608,19394,19619,20024,2500,1815,657,1000,1000,800,0\r\n7,500000,1,1,2,29,0,0,0,0,0,0,367965,412023,445007,542653,483003,473944,55000,40000,38000,20239,13750,13770,0\r\n8,100000,2,2,2,23,0,-1,-1,0,0,-1,11876,380,601,221,-159,567,380,601,0,581,1687,1542,0\r\n9,140000,2,3,1,28,0,0,2,0,0,0,11285,14096,12108,12211,11793,"...
}
  4> var tmp = text.replacingOccurrences(of: "\r\n", with: "\n")
tmp: String = {
  _core = {
    _baseAddress = <extracting data from value failed>

    _countAndFlags = <extracting data from value failed>

    _owner = <extracting data from value failed>

  }
}
Execution interrupted. Enter code to recover and continue.
Enter LLDB commands to investigate (type :help for assistance.)

And I found myself without any more options for finding out what’s happening.

Is it a bug I need to report? Is there something I’m missing?

What would really be a blessing is something akin to the Xcode Swift documentation that covered the Linux equivalents of commands.

I know it is still a young project, but it really needs to improve the docs for Linux development to be viable. (And a GUI like Xcode too :wink: )

If anyone knows anything that can help, I’d be grateful.

Carlos Barrera
shadowcharly@gmail.com


(Brent Royal-Gordon) #2

        #if os(Linux)
            print("Let's change those carriage returns...")
            processedString = string.replacingOccurrences(of: "\r\n", with: "\n")
        #else
            processedString = string.replacingOccurrences(of: "\r\n", with: "\n")
        #endif
And it compiles flawlessly in linux, but when it’s time to execute, it throws an error message like:
Let's change those carriage returns...
Illegal instruction (core dumped)

I don't have a Swift on Linux box, so I can't really run Corelibs Foundation myself. But looking at the source code <https://github.com/apple/swift-corelibs-foundation/blob/182d5c970114ec7f981aceaaa054d51e29923cf3/Foundation/NSString.swift#L1383>, I'm not sure that it will behave correctly when you replace a string with a differently-sized string and .backwardsSearch is not set. Maybe there's something I'm missing, but it looks to me like each replacement will shift the characters in the string, but the ranges won't move with them. The replacements will miss their targets more and more, and eventually—if the replacement is shorter than the original and one of the matches is near enough to the end—they might run past the end of the string.
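
To make the suspected failure mode concrete, here is a minimal sketch. It is not the corelibs implementation: `findMatches` and `replaceAll` are hypothetical helpers that work on raw UTF-8 bytes, and they deliberately reuse offsets computed before any replacement so the drift is visible. Whether the real code fails in exactly this way is an assumption.

```swift
// Hypothetical helpers for illustration only; not taken from swift-corelibs-foundation.

/// Returns the start offsets of every non-overlapping match of `pattern` in `text`.
func findMatches(of pattern: [UInt8], in text: [UInt8]) -> [Int] {
    guard !pattern.isEmpty else { return [] }
    var starts: [Int] = []
    var i = 0
    while i + pattern.count <= text.count {
        if Array(text[i..<(i + pattern.count)]) == pattern {
            starts.append(i)
            i += pattern.count
        } else {
            i += 1
        }
    }
    return starts
}

/// Replaces matches using offsets computed up front and never adjusted afterwards.
/// Returns nil when a stale offset would run past the end of the buffer
/// (unchecked code would trap at that point instead).
func replaceAll(_ pattern: String, with replacement: String,
                in text: String, backwards: Bool) -> String? {
    let patternBytes = Array(pattern.utf8)
    let replacementBytes = Array(replacement.utf8)
    var buffer = Array(text.utf8)
    let starts = findMatches(of: patternBytes, in: buffer)
    let order = backwards ? Array(starts.reversed()) : starts
    for start in order {
        let end = start + patternBytes.count
        guard end <= buffer.count else { return nil }
        buffer.replaceSubrange(start..<end, with: replacementBytes)
    }
    return String(decoding: buffer, as: UTF8.self)
}

// Forward order: the second offset is stale, so the wrong bytes are replaced.
let forward = replaceAll("\r\n", with: "\n", in: "a\r\nb\r\nc", backwards: false)
// Backward order: earlier offsets stay valid because only later text has moved.
let backward = replaceAll("\r\n", with: "\n", in: "a\r\nb\r\nc", backwards: true)
// Forward order with a match at the very end: the stale range overruns the buffer.
let overrun = replaceAll("\r\n", with: "\n", in: "a\r\nb\r\n", backwards: false)

print(forward == "a\nb\r\n")  // true: the second "\r\n" survived and "c" was lost
print(backward == "a\nb\nc")  // true
print(overrun == nil)         // true
```

Going backwards works in this sketch because each replacement only shifts text that has already been processed.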

Carlos: Try passing `options: .backwardsSearch` to `replacingOccurrences`. If I'm right about this being a bug, doing that should work around it, and you ought not to notice any difference in your code's behavior. (This is not something you should have necessarily figured out yourself; you haven't missed anything obvious.)
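
A minimal sketch of that workaround, assuming a Foundation where the option is spelled `.backwards` (the released Swift 3 name; the snapshot discussed here spells it `.backwardsSearch`). The sample text is a shortened stand-in for the CSV contents, not the real file.

```swift
import Foundation

// Shortened stand-in for the CSV data, with Windows-style line endings.
let original = "ID,LIMIT_BAL\r\n1,20000\r\n2,120000\r\n"

// Searching from the end means each replacement only moves text that has
// already been handled, so earlier match ranges stay valid.
let normalized = original.replacingOccurrences(of: "\r\n",
                                               with: "\n",
                                               options: .backwards)

print(normalized == "ID,LIMIT_BAL\n1,20000\n2,120000\n")  // true
```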

swift-corelibs-dev (who I've cc'd on this thread): Am I correct that this is a bug? Should it be/has it been filed? (A quick-and-dirty search of the bug tracker seems to indicate it hasn't been, but I don't have that much experience with it.)

--
Brent Royal-Gordon
Architechies


(Tony Parker) #3

Hi Brent,

swift-corelibs-dev (who I've cc'd on this thread): Am I correct that this is a bug? Should it be/has it been filed? (A quick-and-dirty search of the bug tracker seems to indicate it hasn't been, but I don't have that much experience with it.)

Can’t hurt to file a bug. It’ll help us track what we have left, and also give people a place to look for something they can help with.

Thanks,
- Tony

On May 26, 2016, at 2:29 PM, Brent Royal-Gordon via swift-corelibs-dev <swift-corelibs-dev@swift.org> wrote:

--
Brent Royal-Gordon
Architechies

_______________________________________________
swift-corelibs-dev mailing list
swift-corelibs-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-corelibs-dev