Empower String type with regular expression

Others have suggested a programmatic regex instead of a regex literal; how
about doing both? Something like:

enum RegexElement {
    case capture(name: String, value: String)
    case special(Special)
    // ...
    enum Special: String {
        case startOfLine = "^"
        // ...
        case endOfLine = "$"
    }
}

// Define a regexLiteral syntax that the compiler understands that is of
// type Regex and consists of String representations of RegexElements, e.g.
// using forward slash:
// /<RegexElements>*/

// Compiled, immutable, thread safe, and bridged to NSRegularExpression
struct Regex: CustomStringConvertible {
    // ... internal compiled representation
    let elements: [RegexElement]
    var description: String {
        // Example. Really returns all the elements converted back to a string
        return RegexElement.Special.startOfLine.rawValue
    }
    init(_ elements: RegexElement...) {
        // Example. Really also compiles the expression
        self.elements = elements
    }
    // init(regexLiteral regex: Regex) {
    // init(concatAll regexes: Regex...) {
    // init(fromString string: String) {
    // ... more inits
    func map<T>(input: String, @noescape mapper: (element: RegexElement) throws -> T) rethrows -> [T] {
        // Example. Really does the matching
        return [try mapper(element: RegexElement.special(.startOfLine))]
    }
    // func flatMap<T>(input: String, @noescape mapper: (element: RegexElement) throws -> T?) rethrows -> [T] {
    // func flatMap<S: SequenceType>(input: String, @noescape mapper: (element: RegexElement) throws -> S) rethrows -> [S.Generator.Element] {
    // func forEach(input: String, @noescape eacher: (element: RegexElement) throws -> Void) rethrows {
    // ... more funcs
}

// Normally a regex literal
let regex = Regex(RegexElement.special(.startOfLine))
// Returns `["^"]` in example
let asStringArray = regex.map("Example") { element -> String in
    switch element {
    case let .capture(_, v): return v
    case let .special(s): return s.rawValue
    }
}

The advantages are:

   1. We get a literal type for convenience.
   2. We get a programmatic type when we need to manipulate regexes.
   3. Breaking the regex matches into the enum defined elements of the
   regex works well with Swift pattern matching.

(Above is a very rough sketch!)
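
To make the "converted back to a string" part of the sketch concrete, here is a minimal version that compiles with current Swift. It is only an illustration: treating the `value` of a capture as its sub-pattern and rendering it as a named group `(?<name>...)` are assumptions made for the example, not something the sketch above specifies.

```swift
// Minimal sketch, assuming a capture's `value` holds its sub-pattern.
enum RegexElement {
    case capture(name: String, value: String)
    case special(Special)

    enum Special: String {
        case startOfLine = "^"
        case endOfLine = "$"
    }
}

struct Regex: CustomStringConvertible {
    let elements: [RegexElement]

    init(_ elements: RegexElement...) {
        self.elements = elements
    }

    // Renders every element back into regex source text.
    var description: String {
        return elements.map { element -> String in
            switch element {
            case let .capture(name, value):
                // Assumed rendering: a named capture group.
                return "(?<\(name)>\(value))"
            case let .special(special):
                return special.rawValue
            }
        }.joined()
    }
}

let yearRegex = Regex(
    .special(.startOfLine),
    .capture(name: "year", value: "[0-9]{4}"),
    .special(.endOfLine)
)
print(yearRegex)
```

Because the elements are ordinary enum values, they can be built, inspected, and concatenated in code before anything is compiled.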


On 2 February 2016 at 16:44, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:

Something like Scala's extractors or F#'s Active Patterns would be most
welcome to generalize pattern matching.

F Sharp Programming/Active Patterns - Wikibooks, open books for an open world

-Thorsten

On 01.02.2016 at 15:46, James Campbell via swift-evolution <swift-evolution@swift.org> wrote:

It would be great if we could create a generic way of making this Swifty.
You may, let's say, want to implement a matching system for structures like
JSON or XML (e.g. XQuery).


On Mon, Feb 1, 2016 at 2:43 PM, Patrick Gili via swift-evolution <swift-evolution@swift.org> wrote:

Hi Dany,

My response is inline below.

Cheers,
-Patrick

On Jan 31, 2016, at 8:56 PM, Dany St-Amant <dsa.mls@icloud.com> wrote:

On 31 Jan 2016, at 16:46, Patrick Gili <gili.patrick.r@gili-labs.com> wrote:

Hi Dany,

Please find my response inline below.

Cheers,
-Patrick

On Jan 31, 2016, at 3:46 PM, Dany St-Amant via swift-evolution <swift-evolution@swift.org> wrote:

This seems to be two proposals in one:
1. Initialize NSRegularExpression with a single String which includes
options

The ultimate goal, based on the earlier mail in the thread, seems to be to
enable, in a future proposal, things like: string ~= replacePattern, if
string =~ pattern, decoupled from the legacy Obj-C. Isn’t
NSRegularExpression part of the legacy? The conversion of the literal
string to a regular expression should probably be part of the proposal for
these operators, as that is when we will know how we want the text to be
interpreted.

I don't see any evidence of NSRegularExpression becoming part of any
legacy. Given SE-005, SE-006, and SE-023, the name is probably changing
from NSRegularExpression to RegularExpression. However, I don't think the
definition of the class will change, only the name.

I would like to see a regular expression matching operator, like Ruby
and Perl have. I was trying to keep the proposal a minimal increment
that would buy the biggest bang for the buck. We can already accomplish
much of what other languages can do with regard to regular expressions.
However, the notion of a regular expression isn't something we can work
around with a custom library today. Can you suggest something additional
that should be in the proposal?

Splitting a proposal into smaller ones has its advantages, but here I am
just wondering whether we are sure that these future operations will use
NSRegularExpression/RegularExpression. And does the currently selected
syntax allow for future expansion? It would be bad to introduce something
that needs to be torn out or changed in an incompatible way once we
really start to use it in its final location.

The proposal is focused on search, but seems to skip substitution; I am
unable to see an option in the proposal to replace all matches instead of
only the first one. I, like many others, would expect regular expressions
in a language to also support substitution.

As for additions to the proposal, the string processing could support any
character (within some limit) in place of the slash delimiter. With
sed, when replacing a path component, one can do: echo $PWD | sed -e
"s:^/usr/local/bin:/opt/share/bin:g", instead of escaping every single
slash, which is really handy for making things easier to read.
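
For comparison, this is roughly what the same substitution looks like with what Foundation offers today, independent of the proposal. It is a small sketch: the sample strings are made up, and it relies only on `String.replacingOccurrences(of:with:options:)` with `.regularExpression`, which replaces every match rather than just the first. Since the pattern is an ordinary string, the slashes need no escaping.

```swift
import Foundation

// Rewrite a path prefix; the pattern is a plain String, so "/" is not special.
let path = "/usr/local/bin/tool"
let rewritten = path.replacingOccurrences(
    of: "^/usr/local/bin",
    with: "/opt/share/bin",
    options: .regularExpression
)

// Every match is replaced, comparable to the "g" flag in the sed example.
let dashed = "a-b-c"
let plussed = dashed.replacingOccurrences(
    of: "-",
    with: "+",
    options: .regularExpression
)

print(rewritten)
print(plussed)
```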

Also, putting aside that I think \(scheme) should not be interpreted in
the example, with a syntax allowing such interpretation the variable should
be processed to generate proper escaping. If one uses \(filename) one
gets "main.c", but one must use \(filename.escaped()) to get the proper
"main\.c" and avoid matching "mainac". String.escaped() must produce a
format compatible with the one used when converting the regular
expression into an NSRegularExpression (I am not sure the two syntaxes are
the same; I think that at least the handling of / may differ).
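
The "main.c" versus "mainac" problem can be reproduced with existing API. The sketch below uses `NSRegularExpression.escapedPattern(for:)`, which already exists in Foundation, as a stand-in for the hypothetical `String.escaped()` mentioned above; the `matches` helper is just a convenience written for this example.

```swift
import Foundation

let filename = "main.c"

// Interpolating the raw value leaves "." as a regex wildcard.
let unescapedPattern = "^\(filename)$"

// Escaping first turns metacharacters into literals.
let escapedPattern = "^\(NSRegularExpression.escapedPattern(for: filename))$"

// Helper for this example only: true if `pattern` matches somewhere in `input`.
func matches(_ pattern: String, _ input: String) -> Bool {
    return input.range(of: pattern, options: .regularExpression) != nil
}

print(matches(unescapedPattern, "mainac")) // the unescaped "." matches "a"
print(matches(escapedPattern, "mainac"))   // the escaped "." only matches "."
print(matches(escapedPattern, "main.c"))
```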

I agree. Perhaps I went too far with keeping the proposal
short-and-sweet. Especially when you consider the rich syntax that Perl
supports for substitution.

2. Easily create a String without escaping (\n is not linefeed, but \ and
n)

The ability to not interpret the backslash as an escape can be useful in
scenarios other than creating an NSRegularExpression, such as creating a
Windows pathname, or creating regular expressions which are then given to
an external tool. So this part of the proposal should probably be
generalized.

Generalize it for what? If you're thinking along the lines of raw strings,
I agree that we need this capability, as well as multi-line string
literals. However, I would just as soon have separate proposals for these.

My point/opinion here is that regular expressions are just Strings
which are then interpreted, the same way "Good Morning", "Bonjour", or
"Marhaba" (even when written in Arabic script) are just Strings when you
assign them to a variable in Swift, and are then interpreted by the
intended user. They are not String, frenchString, rightToLeftString. So I
do not see why a regular expression should get privileged treatment and
have its own language-level syntax. The only difference when writing a
regular expression, a Windows pathname, or any String whose syntax makes
heavy use of backslashes, is that one may want to disable the special
meaning of the backslashes to make things more readable.

On the topic of geeky-ing the String, there are four main parts, IMHO:
- multi-line support
- a version without backslash escaping (which should also not process the
\(variable) format)
- inclusion of the String delimiter inside the String
- concatenation of the backslash/no-backslash versions. Bash example: echo
'echo "$BASH" shows '"$BASH"
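
For readers following this list with a later toolchain: Swift did eventually gain multi-line string literals (Swift 4) and raw string literals delimited with `#` (Swift 5), which cover the items above. The following is a short sketch of how each item looks in that syntax; the variable names and sample values are made up for illustration.

```swift
// No backslash escaping: "\n" stays as two characters, backslash and "n".
let raw = #"\n"#

// Handy for Windows pathnames and regex source text.
let windowsPath = #"C:\Users\dany\main.c"#

// The String delimiter can appear inside the String unescaped.
let quoted = #"echo "$BASH""#

// Interpolation is off by default in a raw string...
let answer = 42
let notInterpolated = #"\(answer)"#

// ...and can be switched back on per use with a matching "#".
let interpolated = #"value: \#(answer)"#

// Multi-line support.
let multiLine = """
    line one
    line two
    """

print(raw, windowsPath, quoted, notInterpolated, interpolated, multiLine)
```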

I’m still trying to find the mail thread crumbs on these topics, since
before restarting the discussion the previous one should be properly
summarized, unless such a summary already exists.

I think supporting interpolation is important. Both Perl and Ruby support
it, and I'm sure there are other languages. One thing I forgot to put into
the proposal: an option to disable interpolation or limit it to single pass.

Looking ahead at the other responses, Chris Lattner has suggested that
the proposal would have more traction if we can find a way to fold this
into Swift's pattern matching. I can't say that I disagree, as this makes
regular expressions more Swifty.
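
One way to picture "folding this into Swift's pattern matching" with what already exists is to overload the `~=` operator, which `switch` uses for expression patterns. This is only a sketch of the idea, not what the proposal specifies; the `classify` function and the two patterns are made up for the example.

```swift
import Foundation

// Lets an NSRegularExpression appear directly as a `case` pattern for a String.
func ~= (pattern: NSRegularExpression, value: String) -> Bool {
    let range = NSRange(value.startIndex..., in: value)
    return pattern.firstMatch(in: value, options: [], range: range) != nil
}

// Example patterns; `try!` is acceptable here because they are fixed and valid.
let decimalDigits = try! NSRegularExpression(pattern: "^[0-9]+$")
let hexLiteral = try! NSRegularExpression(pattern: "^0x[0-9a-fA-F]+$")

func classify(_ text: String) -> String {
    switch text {
    case decimalDigits: return "decimal"
    case hexLiteral: return "hexadecimal"
    default: return "other"
    }
}

print(classify("2016"), classify("0xFF"), classify("regex"))
```

This only answers "did it match?"; binding captures to names inside the `case` is the part that would need language support.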

Regards,
Dany

Dany

On 31 Jan 2016, at 12:18, Patrick Gili via swift-evolution <swift-evolution@swift.org> wrote:

Here is the link to the proposal on GitHub:

https://github.com/gili-patrick-r/swift-evolution/blob/master/proposals/NNNN-regular-expression-literals.md

Cheers,
-Patrick

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


--
  -- Howard.

This is pretty interesting, and I agree with the general observation here: there is a duality between printing and parsing that should be modeled and exploited.

One of the things that we discussed in the time leading up to Swift 1 (but then pushed off and never came back to) was a model for doing standardized formatted printing - capturing the power and expressiveness of printf, without the problems it entails. A sketch of the proposal is here:

The pertinent idea here is that types can have a default printing rule (e.g. integers print in decimal) but then can opt into providing a more powerful formatting model by supporting modifier characters (for example “x” for an integer could mean “hexadecimal”). This would be protocol based, so arbitrary types could participate.

If you bring the same concept to regex parsing, I think it would make a lot of sense for primitive types to support “default” regex rules (e.g. integers would default to /[0-9]+/ ) and then have modifier characters that support other standard modes for them (e.g. x for hexadecimal). This would obviously want to be extensible to arbitrary types, so that (e.g.) NSDate could support the format families that make sense.

Going with this would address one of my major gripes with regexes as used in practice, which is that they are a huge violation of the DRY principle, and that reuse of regex patterns almost never happens.

-Chris
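
A rough illustration of the protocol-based idea, to make the discussion concrete. Everything here is hypothetical: the `RegexParsable` protocol, its member names, and the `parseFirst` helper are invented for the example and are not part of any proposal or library. Only the default rule (`[0-9]+`) and the `x` modifier for hexadecimal come from the description above.

```swift
import Foundation

// Hypothetical protocol: a type advertises the regex that recognizes its
// textual form, optionally adjusted by a modifier character.
protocol RegexParsable {
    static func regexPattern(modifier: Character?) -> String
    init?(matchedText: String, modifier: Character?)
}

extension Int: RegexParsable {
    static func regexPattern(modifier: Character?) -> String {
        // "x" selects hexadecimal; anything else falls back to decimal.
        return modifier == "x" ? "[0-9a-fA-F]+" : "[0-9]+"
    }

    init?(matchedText: String, modifier: Character?) {
        self.init(matchedText, radix: modifier == "x" ? 16 : 10)
    }
}

// Hypothetical helper: finds the first substring matching the type's rule
// and converts it.
func parseFirst<T: RegexParsable>(
    _ type: T.Type, in input: String, modifier: Character? = nil
) -> T? {
    let pattern = T.regexPattern(modifier: modifier)
    guard let range = input.range(of: pattern, options: .regularExpression) else {
        return nil
    }
    return T(matchedText: String(input[range]), modifier: modifier)
}

let itemCount = parseFirst(Int.self, in: "items: 42")
let mask = parseFirst(Int.self, in: "ff", modifier: "x")
let missing = parseFirst(Int.self, in: "no digits here")
print(itemCount as Any, mask as Any, missing as Any)
```

The pattern for a type is written once, next to the type, which is the DRY point being made above.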


On Feb 1, 2016, at 9:44 PM, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:

Something like Scala's extractors or F#'s Active Patterns would be most welcome to generalize pattern matching.


Hi Howard,

I don't see how this is very different from the Swift Verbal Expressions. It would suffer from the same disadvantages I have stated previously.

Cheers,
-Patrick


I really like this idea. It would be very nice to be able to describe a regex like “(name:String), (dateOfBirth: NSDate), (notes: String), (EOL)” and have the details of what characters constitute each of these types be specified on a per-type basis. NSScanner can do something a bit like this, but it requires quite a bit of care and feeding.

-jcr


On Feb 2, 2016, at 9:29 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:

If you bring the same concept to regex parsing, I think it would make a lot of sense for primitive types to support “default” regex rules (e.g. integers would default to /[0-9]+/ ) and then have modifier characters that support other standard modes for them (e.g. x for hexadecimal).

The difference is that I am proposing supporting both verbal expressions
and regex literals, with literals converted to verbals and the processing
happening at the verbal level. The reason for this is that verbals are easy
to handle programmatically, whilst literals are great for quickly
specifying a regex.
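
As a very rough sketch of what "literals are converted to verbals" could mean, the function below turns regex source text into an array of elements. It is deliberately tiny and makes several assumptions: the `.literal` case and the `elements(fromLiteral:)` function are hypothetical additions for this example, only `^` and `$` are recognized as special, and escapes, groups, and captures are ignored.

```swift
enum RegexElement: Equatable {
    case literal(String)   // hypothetical case, added for this example
    case special(Special)

    enum Special: String {
        case startOfLine = "^"
        case endOfLine = "$"
    }
}

// Hypothetical converter from literal text to the programmatic form.
func elements(fromLiteral literal: String) -> [RegexElement] {
    var result: [RegexElement] = []
    var pendingText = ""

    // Emits any accumulated plain text as a single literal element.
    func flushPendingText() {
        if !pendingText.isEmpty {
            result.append(.literal(pendingText))
            pendingText = ""
        }
    }

    for character in literal {
        if let special = RegexElement.Special(rawValue: String(character)) {
            flushPendingText()
            result.append(.special(special))
        } else {
            pendingText.append(character)
        }
    }
    flushPendingText()
    return result
}

let parsed = elements(fromLiteral: "^abc$")
print(parsed)
```

All matching and manipulation would then work on the array, whichever way the regex was originally written.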


On Tuesday, 2 February 2016, Patrick Gili <gili.patrick.r@gili-labs.com> wrote:

Hi Howard,

I don't see how this is very different from the Swift Verbal Expressions.
It would suffer from the same disadvantages I have stated previously.

Cheers,
-Patrick

On Feb 2, 2016, at 1:51 AM, Howard Lovatt via swift-evolution < > swift-evolution@swift.org > <javascript:_e(%7B%7D,'cvml','swift-evolution@swift.org');>> wrote:

Others have suggested a programatic regex instead of a regex literal, how
about doing both? Something like:

enum RegexElement {
    case capture(name: String, value: String)
    case special(Special)
    // ...
    enum Special: String {
        case startOfLine = "^"
        // ...
        case endOfLine = "$"
    }
}

// Define a regexLiteral syntax that the compiler understands that is of
type Regex and consists of String representations of RegexElements, e.g.
using forward slash:
// /<RegexElements>*/

struct Regex: CustomStringConvertible { // Compiled, immutable, thread
safe, and bridged to NSRegularExpression
    // ... internal compiled representation
    let elements: [RegexElement]
    var description: String {
        return RegexElement.Special.startOfLine.rawValue // Example.
Really returns all the elements converted back to a string
    }
    init(_ elements: RegexElement...) {
        self.elements = elements // Example. Really also compiles the
expression
    }
    // init(regexLiteral regex: Regex) {
    // init(concatAll regexes: Regex...) {
    // init(fromString string: String) {
    // ... more inits
    func map<T>(input: String, @noescape mapper: (element: RegexElement)
throws -> T) rethrows -> [T] {
        return [try mapper(element: RegexElement.special(.startOfLine))] //
Example. Really does the matching
    }
    // func flatMap<T>(input: String, @noescape mapper: (element:
RegexElement) throws -> T?) rethrows -> [T] {
    // func flatMap<S: SequenceType>(input: String, @noescape mapper:
(element: RegexElement) throws -> S) rethrows -> [S.Generator.Element] {
    // func forEach(input: String, @noescape eacher: (element:
RegexElement) throws -> Void) rethrows {
    // ... more funcs
}

let regex = Regex(RegexElement.special(.startOfLine)) // Normally a regex
literal
let asStringArray = regex.map("Example") { element -> String in //
Returns `["^"]` in example
    switch element {
    case let .capture(_, v): return v
    case let .special(s): return s.rawValue
    }
}

The advantages are:

   1. We get a literal type for convenience.
   2. We get a programatic type when we need to manipulate regexes.
   3. Breaking the regex matches into the enum defined elements of the
   regex works well with Swift pattern matching.

(Above is a very rough sketch!)

On 2 February 2016 at 16:44, Thorsten Seitz via swift-evolution < > swift-evolution@swift.org > <javascript:_e(%7B%7D,'cvml','swift-evolution@swift.org');>> wrote:

Something like Scala's extractors or F#'s Active Patterns would be most
welcome to generalize pattern matching.

Redirecting…
F Sharp Programming/Active Patterns - Wikibooks, open books for an open world

-Thorsten

Am 01.02.2016 um 15:46 schrieb James Campbell via swift-evolution < >> swift-evolution@swift.org
<javascript:_e(%7B%7D,'cvml','swift-evolution@swift.org');>>:

It would be great if we could create a generic way of making this swifty.
You may let say want to implement a matching system for structure like JSON
or XML (i.e XQuery).

*___________________________________*

*James⎥Lead Engineer*

*james@supmenow.com
<javascript:_e(%7B%7D,'cvml','james@supmenow.com');>⎥supmenow.com
<http://supmenow.com/&gt;\*

*Sup*

*Runway East *

*10 Finsbury Square*

*London*

* EC2A 1AF *

On Mon, Feb 1, 2016 at 2:43 PM, Patrick Gili via swift-evolution < >> swift-evolution@swift.org >> <javascript:_e(%7B%7D,'cvml','swift-evolution@swift.org');>> wrote:

Hi Dany,

My response is inline below.

Cheers,
-Patrick

On Jan 31, 2016, at 8:56 PM, Dany St-Amant <dsa.mls@icloud.com >>> <javascript:_e(%7B%7D,'cvml','dsa.mls@icloud.com');>> wrote:

Le 31 janv. 2016 à 16:46, Patrick Gili <gili.patrick.r@gili-labs.com >>> <javascript:_e(%7B%7D,'cvml','gili.patrick.r@gili-labs.com');>> a écrit
:

Hi Dany,

Please find my response inline below.

Cheers,
-Patrick

On Jan 31, 2016, at 3:46 PM, Dany St-Amant via swift-evolution < >>> swift-evolution@swift.org >>> <javascript:_e(%7B%7D,'cvml','swift-evolution@swift.org');>> wrote:

This seem to be two proposals in one:
1. Initialize NSRegularExpression with a single String which includes
options

The ultimate goal based on the earlier mail in the thread seems to be
able in a future proposal do thing like: string ~= replacePattern, if
string =~ pattern, decoupled from the legacy Obj-C. Isn’t
NSRegularExpression part of the legacy? The conversion of the literal
string as regular expression should probably part of the proposal for these
operators; as this is the time we will know how we want the text to be
interpreted.

I don't see any evidence of NSRegularExpression becoming part of any
legacy. Given SE-005, SE-006, and SE-023, the name is probably changing
from NSRegularExpression to RegularExpression. However, I don't think the
definition of the class will change, only the name.

I would like to see an operator regular expression matching operator,
like Ruby and Perl. I was trying to keep the proposal a minimal increment
that would buy the biggest bang for the buck. We can already accomplish
much of what other languages can do with regard to regular expression.
However, the notion of a regular expression isn't something we can work
around with custom library today. Can you suggest something addition that
should be in the proposal?

Splitting proposal in smaller ones have its advantage, but here I am
just wondering if we are sure that these future operation will use the
NSRegularExpression/RegularExpression. And does the currently selected
syntax allow for future expansion, it would be bad to introduce something
that need to be torn away or changed in an incompatible way, once we
really start to use them in their final location.

The proposal is focused on the search, but seem to skip the
substitution; I am unable to see an option to replace all matches instead
of the first one only in the proposal. I, as many other, would expect
regular expression in a language to also support substitution.

As for addition to the proposal, the processing of the string could be
support for any character (within some limit) for the slash delimiter. With
sed, when replacing path component, one can do: echo $PWD | sed -e
"s:^/usr/local/bin:/opt/share/bin:g", instead of escaping every single
slashes. Which is really handy to make thing easier to read.

Also, putting aside that I think \(scheme) should not be interpreted in
the example, with a syntax allowing such interpretation the variable should
be processed to generate proper escaping. If one is to use \(filename) you
get "main.c", but one must use \(filename.escaped()) to get the proper
"main\.c" to avoid matching "mainac". The String.escaped() must be in a
format compatible with the format used when converting the regular
expression into NSRegularExpression (not sure if the two syntax are the
same; I think that at least the handling of / may differ)

I agree. Perhaps I went too far with keeping the proposal
short-and-sweet. Especially when you consider the rich syntax that Perl
supports for substitution.

2. Easily create a String without escaping (\n is not linefeed, but \
and n)

The ability to not interpret the backslash as escape can be useful in
other scenario that creating a NSRegularExpression; like creating a Windows
pathname, or creating regular expression which are then given to external
tool. So this part of the proposal should probably be generalized.

Generalize it for what? If you're thinking along the line of raw
strings, I agree that we need this capability, as well as multi-line string
literals. However, I just soon we have separate proposals for this.

My point/opinion here is that regular expressions are just Strings
which are then interpreted, the same way "Good Morning", "Bonjour", and
"Marhaba" (even when using the Arabic script) are just Strings when you
assign them to a variable in Swift, and are then interpreted by the intended
user. They are not String, frenchString, rightToLeftString. So I do not see
why a regular expression should get privileged treatment and have its own
language-level syntax. The only difference when writing a regular expression,
a Windows pathname, or any String whose syntax makes heavy use of
backslashes is that one may want to disable the special meaning of the
backslashes to make things more readable.

On the subject of geeking up the String, there are four main parts, IMHO:
- multi-line support
- a version without backslash escaping (which should also not process the
\(variable) format)
- inclusion of the String delimiter inside the String
- concatenation of the backslash/no-backslash versions. Bash example: echo 'echo
"$BASH" shows '"$BASH"

I’m still trying to track down the earlier mail threads on these topics,
since the previous discussions should be properly summarized before
restarting them, unless such a summary already exists.

I think supporting interpolation is important. Both Perl and Ruby
support it, and I'm sure there are other languages. One thing I forgot to
put into the proposal: an option to disable interpolation or limit it to
single pass.

Looking ahead at the other responses, Chris Lattner has suggested that
the proposal would have more traction if we can find a way to fold this
into Swift's pattern matching. I can't say that I disagree, as this makes
regular expressions more Swifty.

Regards,
Dany

On 31 Jan 2016, at 12:18, Patrick Gili via swift-evolution <swift-evolution@swift.org> wrote:

Here is the link to the proposal on GitHub:

https://github.com/gili-patrick-r/swift-evolution/blob/master/proposals/NNNN-regular-expression-literals.md

Cheers,
-Patrick

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

--
  -- Howard.

Hi Chris,

I have been reading the documentation and design notes for regular expressions in Perl6. Thank you for the pointer to the documentation. I really like what the Perl6 community has done with regular expressions (I like the notion of grammars too).

Given the insights you shared below, I imagine we can address your gripe with regular expressions by implementing something like Perl6's named regular expressions, which easily facilitates reuse.

To further encourage reuse, we could introduce the notion of "standardized parsing" of data types, much in the same way that Swift already supports the notion of "standardized formatted printing".

To achieve this, we could introduce a protocol, similar to CustomStringConvertible (or maybe extend CustomStringConvertible) that would not define a method to convert a string to the data type, but define the regular expression that facilitates the conversion. The protocol could also facilitate the reuse of the regular expression in other regular expression literals.

Cheers,
-Patrick

On Feb 3, 2016, at 12:29 AM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:

On Feb 1, 2016, at 9:44 PM, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:

Something like Scala's extractors or F#'s Active Patterns would be most welcome to generalize pattern matching.

F Sharp Programming/Active Patterns - Wikibooks, open books for an open world

This is pretty interesting, and I agree with the general observation here: there is a duality between printing and parsing that should be modeled and exploited.

One of the things that we discussed in the time leading up to Swift 1 (but then pushed off and never came back to) was a model for doing standardized formatted printing - capturing the power and expressiveness of printf, without the problems it entails. A sketch of the proposal is here:
https://github.com/apple/swift/blob/master/docs/TextFormatting.rst

The pertinent idea here is that types can have a default printing rule (e.g. integers print in decimal) but then can opt into providing a more powerful formatting model by supporting modifier characters (for example “x” for an integer could mean “hexadecimal”). This would be protocol based, so arbitrary types could participate.

If you bring the same concept to regex parsing, I think it would make a lot of sense for primitive types to support “default” regex rules (e.g. integers would default to /[0-9]+/ ) and then have modifier characters that support other standard modes for them (e.g. x for hexadecimal). This would obviously want to be extensible to arbitrary types, so that (e.g.) NSDate could support the format families that make sense.
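
A rough sketch of how such protocol-based default rules could look is given below. Everything in it is an assumption made for illustration: the `RegexParsable` protocol, its two requirements, and the `parseFirst` helper are hypothetical names, not existing Swift APIs. Matching is delegated to Foundation's regular expression search on String.

```swift
import Foundation

// Hypothetical protocol, not an existing Swift API.
protocol RegexParsable {
    /// Pattern for the type; `modifier` selects an alternate mode, nil means the default.
    static func pattern(modifier: Character?) -> String
    /// Builds a value from text that matched `pattern(modifier:)`.
    init?(matched text: String, modifier: Character?)
}

extension Int: RegexParsable {
    static func pattern(modifier: Character?) -> String {
        // "x" selects hexadecimal; anything else falls back to decimal.
        modifier == "x" ? "[0-9A-Fa-f]+" : "[0-9]+"
    }

    init?(matched text: String, modifier: Character?) {
        self.init(text, radix: modifier == "x" ? 16 : 10)
    }
}

// Hypothetical helper: finds the first occurrence of T's pattern and converts it.
func parseFirst<T: RegexParsable>(_ type: T.Type, in input: String,
                                  modifier: Character? = nil) -> T? {
    guard let range = input.range(of: T.pattern(modifier: modifier),
                                  options: .regularExpression) else { return nil }
    return T(matched: String(input[range]), modifier: modifier)
}

print(parseFirst(Int.self, in: "width=640") as Any)            // Optional(640)
print(parseFirst(Int.self, in: "-> ff", modifier: "x") as Any) // Optional(255)
```

Because the pattern lives on the type, the same rule could be reused inside larger expressions instead of being retyped, which is the DRY point made in this message.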

Going with this would address one of my major gripes with regexes as used in practice, which is that they are a huge violation of the DRY principle, and that reuse of regex patterns almost never happens.

-Chris

I don't feel good about this direction for the following reasons:
1) Complexity
2) Maturity? I don't know how Verbal Expressions has been implemented. Does it leverage mature regex open source? Or, has it been written from scratch?
3) Performance? Compiling a regex literal typically results in an FSM of some sort, optimized to parse strings. I wouldn't think that converting a regex literal to Verbal Expressions would yield great performance every time a match or substitution is done.

-Patrick

On Feb 2, 2016, at 5:55 PM, Howard Lovatt <howard.lovatt@gmail.com> wrote:

The difference is that I am proposing supporting both verbal expressions and regex literals, with literals converted to verbals and the processing happening at the verbal level. The reason for this is that verbals are easy to handle programmatically whilst literals are great for quickly specifying a regex.

On Tuesday, 2 February 2016, Patrick Gili <gili.patrick.r@gili-labs.com> wrote:
Hi Howard,

I don't see how this is very different from the Swift Verbal Expressions. It would suffer from the same disadvantages I have stated previously.

Cheers,
-Patrick

On Feb 2, 2016, at 1:51 AM, Howard Lovatt via swift-evolution <swift-evolution@swift.org> wrote:

Others have suggested a programmatic regex instead of a regex literal. How about doing both? Something like:

enum RegexElement {
    case capture(name: String, value: String)
    case special(Special)
    // ...
    enum Special: String {
        case startOfLine = "^"
        // ...
        case endOfLine = "$"
    }
}

// Define a regexLiteral syntax that the compiler understands that is of type Regex and consists of String representations of RegexElements, e.g. using forward slash:
// /<RegexElements>*/

struct Regex: CustomStringConvertible { // Compiled, immutable, thread safe, and bridged to NSRegularExpression
    // ... internal compiled representation
    let elements: [RegexElement]
    var description: String {
        return RegexElement.Special.startOfLine.rawValue // Example. Really returns all the elements converted back to a string
    }
    init(_ elements: RegexElement...) {
        self.elements = elements // Example. Really also compiles the expression
    }
    // init(regexLiteral regex: Regex) {
    // init(concatAll regexes: Regex...) {
    // init(fromString string: String) {
    // ... more inits
    func map<T>(input: String, @noescape mapper: (element: RegexElement) throws -> T) rethrows -> [T] {
        return [try mapper(element: RegexElement.special(.startOfLine))] // Example. Really does the matching
    }
    // func flatMap<T>(input: String, @noescape mapper: (element: RegexElement) throws -> T?) rethrows -> [T] {
    // func flatMap<S: SequenceType>(input: String, @noescape mapper: (element: RegexElement) throws -> S) rethrows -> [S.Generator.Element] {
    // func forEach(input: String, @noescape eacher: (element: RegexElement) throws -> Void) rethrows {
    // ... more funcs
}

let regex = Regex(RegexElement.special(.startOfLine)) // Normally a regex literal
let asStringArray = regex.map("Example") { element -> String in // Returns `["^"]` in example
    switch element {
    case let .capture(_, v): return v
    case let .special(s): return s.rawValue
    }
}

The advantages are:
  We get a literal type for convenience.
  We get a programmatic type when we need to manipulate regexes.
  Breaking the regex matches into the enum-defined elements of the regex works well with Swift pattern matching.
(Above is a very rough sketch!)
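
For readers on current toolchains, the sketch above can be rendered in Swift 3-and-later syntax roughly as follows. This is an illustrative adaptation, not the author's code: the type is renamed `SketchRegex` (a name chosen here to avoid clashing with the `Regex` type that the standard library gained later), `@noescape` is dropped because closures are non-escaping by default, and closure parameter labels are removed. As in the original, no real matching happens; `map` only hands the stored elements to the closure.

```swift
enum RegexElement {
    case capture(name: String, value: String)
    case special(Special)

    enum Special: String {
        case startOfLine = "^"
        case endOfLine = "$"
    }
}

// Hypothetical stand-in for the `Regex` struct in the sketch above.
struct SketchRegex: CustomStringConvertible {
    let elements: [RegexElement]

    // Stub: joins the textual form of each stored element.
    var description: String {
        elements.map { element -> String in
            switch element {
            case let .capture(_, value): return value
            case let .special(special): return special.rawValue
            }
        }.joined()
    }

    init(_ elements: RegexElement...) {
        self.elements = elements // A real version would also compile the expression.
    }

    // Stub: ignores `input` and maps over the stored elements instead of matching.
    func map<T>(_ input: String, _ transform: (RegexElement) throws -> T) rethrows -> [T] {
        try elements.map(transform)
    }
}

let regex = SketchRegex(.special(.startOfLine)) // Normally a regex literal
let asStringArray = regex.map("Example") { element -> String in
    switch element {
    case let .capture(_, v): return v
    case let .special(s): return s.rawValue
    }
}
print(asStringArray) // ["^"]
```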

On 2 February 2016 at 16:44, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:
Something like Scala's extractors or F#'s Active Patterns would be most welcome to generalize pattern matching.

F Sharp Programming/Active Patterns - Wikibooks, open books for an open world

-Thorsten

On 01.02.2016, at 15:46, James Campbell via swift-evolution <swift-evolution@swift.org> wrote:

It would be great if we could create a generic way of making this Swifty. You may, let's say, want to implement a matching system for structures like JSON or XML (e.g. XQuery).

___________________________________

James⎥Lead Engineer

james@supmenow.com⎥supmenow.com <http://supmenow.com/>
Sup

Runway East

10 Finsbury Square

London

EC2A 1AF

On Mon, Feb 1, 2016 at 2:43 PM, Patrick Gili via swift-evolution <swift-evolution@swift.org> wrote:
Hi Dany,

My response is inline below.

Cheers,
-Patrick

On Jan 31, 2016, at 8:56 PM, Dany St-Amant <dsa.mls@icloud.com> wrote:

On 31 Jan 2016, at 16:46, Patrick Gili <gili.patrick.r@gili-labs.com> wrote:

Hi Dany,

Please find my response inline below.

Cheers,
-Patrick

On Jan 31, 2016, at 3:46 PM, Dany St-Amant via swift-evolution <swift-evolution@swift.org> wrote:

This seems to be two proposals in one:
1. Initialize NSRegularExpression with a single String which includes options

The ultimate goal, based on the earlier mail in the thread, seems to be to allow, in a future proposal, things like string ~= replacePattern and if string =~ pattern, decoupled from the legacy Obj-C. Isn’t NSRegularExpression part of the legacy? The conversion of the literal string into a regular expression should probably be part of the proposal for these operators, as that is when we will know how we want the text to be interpreted.

I don't see any evidence of NSRegularExpression becoming part of any legacy. Given SE-005, SE-006, and SE-023, the name is probably changing from NSRegularExpression to RegularExpression. However, I don't think the definition of the class will change, only the name.

I would like to see a regular expression matching operator, as in Ruby and Perl. I was trying to keep the proposal a minimal increment that would buy the biggest bang for the buck. We can already accomplish much of what other languages can do with regard to regular expressions. However, the notion of a regular expression isn't something we can work around with a custom library today. Can you suggest something additional that should be in the proposal?
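
As an illustration of what a library can already do without language support, a matching operator can be declared today on top of Foundation. The `=~` operator below is a hypothetical definition written for this sketch; the pattern is still an ordinary String, which is the part a library cannot change.

```swift
import Foundation

infix operator =~ : ComparisonPrecedence

// Hypothetical operator for this sketch: true if `pattern` matches anywhere in `input`.
func =~ (input: String, pattern: String) -> Bool {
    input.range(of: pattern, options: .regularExpression) != nil
}

let line = "let answer = 42"
if line =~ "[0-9]+$" {
    print("ends with a number")
}
```

Note that the pattern is only validated at run time here; an invalid pattern simply fails to match, which is one argument for compiler-checked literals.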

Splitting a proposal into smaller ones has its advantages, but here I am just wondering whether we are sure that these future operations will use NSRegularExpression/RegularExpression. And does the currently selected syntax allow for future expansion? It would be bad to introduce something that needs to be torn out or changed in an incompatible way once we really start to use them in their final location.

The proposal is focused on search but seems to skip substitution; I am unable to see in the proposal an option to replace all matches instead of only the first one. I, like many others, would expect regular expressions in a language to also support substitution.
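
For reference, both substitution behaviours are expressible with the existing NSRegularExpression API, although not tersely. The sketch below is only an illustration with made-up input: it replaces every match, then only the first match by restricting the replacement to the range returned by firstMatch.

```swift
import Foundation

let regex = try! NSRegularExpression(pattern: "/usr/local/bin")
let input = "/usr/local/bin:/usr/local/bin/tool"
let whole = NSRange(input.startIndex..., in: input)

// Replace every match, like the "g" flag in sed or Perl.
let all = regex.stringByReplacingMatches(in: input, options: [], range: whole,
                                         withTemplate: "/opt/share/bin")

// Replace only the first match by limiting the search range to that match.
var firstOnly = input
if let first = regex.firstMatch(in: input, options: [], range: whole) {
    firstOnly = regex.stringByReplacingMatches(in: input, options: [], range: first.range,
                                               withTemplate: "/opt/share/bin")
}

print(all)       // /opt/share/bin:/opt/share/bin/tool
print(firstOnly) // /opt/share/bin:/usr/local/bin/tool
```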

As for additions to the proposal, the processing of the string could support any character (within some limits) as the delimiter in place of the slash. With sed, when replacing a path component, one can do: echo $PWD | sed -e "s:^/usr/local/bin:/opt/share/bin:g", instead of escaping every single slash, which is really handy for making things easier to read.

Also, putting aside that I think \(scheme) should not be interpreted in the example, with a syntax allowing such interpretation the variable should be processed to generate proper escaping. If one uses \(filename), one gets "main.c", but one must use \(filename.escaped()) to get the proper "main\.c" and avoid matching "mainac". String.escaped() must produce a format compatible with the one used when converting the regular expression into an NSRegularExpression (I am not sure whether the two syntaxes are the same; I think that at least the handling of / may differ).

I agree. Perhaps I went too far with keeping the proposal short-and-sweet. Especially when you consider the rich syntax that Perl supports for substitution.

2. Easily create a String without escaping (\n is not linefeed, but \ and n)

The ability to not interpret the backslash as an escape can be useful in scenarios other than creating an NSRegularExpression, such as creating a Windows pathname or a regular expression that is then given to an external tool. So this part of the proposal should probably be generalized.

Generalize it for what? If you're thinking along the lines of raw strings, I agree that we need this capability, as well as multi-line string literals. However, I would just as soon have separate proposals for these.

My point/opinion here is that regular expressions are just Strings which are then interpreted, the same way "Good Morning", "Bonjour", and "Marhaba" (even when using the Arabic script) are just Strings when you assign them to a variable in Swift, and are then interpreted by the intended user. They are not String, frenchString, rightToLeftString. So I do not see why a regular expression should get privileged treatment and have its own language-level syntax. The only difference when writing a regular expression, a Windows pathname, or any String whose syntax makes heavy use of backslashes is that one may want to disable the special meaning of the backslashes to make things more readable.

On the subject of geeking up the String, there are four main parts, IMHO:
- multi-line support
- a version without backslash escaping (which should also not process the \(variable) format)
- inclusion of the String delimiter inside the String
- concatenation of the backslash/no-backslash versions. Bash example: echo 'echo "$BASH" shows '"$BASH"
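
For readers coming to this thread later: Swift releases after this discussion added multi-line literals (Swift 4) and raw string literals (Swift 5), which cover the four items in the list above. The snippet below is a small illustration with made-up values; `shell` stands in for the Bash variable.

```swift
// Multi-line literal: indentation up to the closing delimiter is stripped.
let multi = """
    first line
    second line
    """

// Raw literal: backslashes and \(x) are left untouched.
let rawPattern = #"\d+\.\d+ \(notInterpolated)"#

// The string delimiter can appear inside a raw literal without escaping.
let quoted = #"say "hello""#

// Concatenating a raw piece with an interpolated piece, like the Bash example.
let shell = "/bin/bash"
let mixed = #"echo "$BASH" shows "# + "\(shell)"

print(multi)
print(rawPattern)
print(quoted)
print(mixed) // echo "$BASH" shows /bin/bash
```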

I’m still trying to track down the earlier mail threads on these topics, since the previous discussions should be properly summarized before restarting them, unless such a summary already exists.

I think supporting interpolation is important. Both Perl and Ruby support it, and I'm sure there are other languages. One thing I forgot to put into the proposal: an option to disable interpolation or limit it to single pass.

Looking ahead at the other responses, Chris Lattner has suggested that the proposal would have more traction if we can find a way to fold this into Swift's pattern matching. I can't say that I disagree, as this makes regular expressions more Swifty.

Regards,
Dany

On 31 Jan 2016, at 12:18, Patrick Gili via swift-evolution <swift-evolution@swift.org> wrote:

Here is the link to the proposal on GitHub:

https://github.com/gili-patrick-r/swift-evolution/blob/master/proposals/NNNN-regular-expression-literals.md

Cheers,
-Patrick

--
  -- Howard.

I don't see that the two have to be exclusive. If the design of the regex
literal is suitable for both a traditional NSRegularExpression and a verbal
type implementation, then the two can coexist. It can also be staged, so
that a literal can be introduced first with a bridge to legacy
NSRegularExpression and then later a verbal implementation could be added.
The key is to design a literal that is future-proofed.

--
  -- Howard.

Assuming we’re still on track for getting an official package manager in Swift 3…

I vote we put the traditional regex in the stdlib, either as a non-final class or in combination with a “RegularExpressionParser” protocol, and make these “non-standard” parsers compatible with the stdlib implementation (that is, subclass the stdlib parser or conform to the protocol), and put them up in the package manager. “Swift Verbal Expressions” looks very cool, but I’m not sure it’s “lean & mean” enough for the stdlib.
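
One possible reading of the protocol variant is sketched below. Only the name `RegularExpressionParser` comes from the message; its single requirement, the shared `firstMatch(in:)` extension, and both conforming types are assumptions made up for illustration, with Foundation standing in for a stdlib engine.

```swift
import Foundation

// Protocol name taken from the suggestion; its shape here is an assumption.
protocol RegularExpressionParser {
    /// The traditional pattern this parser stands for.
    var pattern: String { get }
}

extension RegularExpressionParser {
    // Shared matching, so every conforming front end uses one engine.
    func firstMatch(in input: String) -> String? {
        input.range(of: pattern, options: .regularExpression).map { String(input[$0]) }
    }
}

// "Traditional" front end: wraps a pattern string.
struct TraditionalRegex: RegularExpressionParser {
    let pattern: String
}

// Verbal-style front end (hypothetical), the kind of type a package could ship.
struct VerbalRegex: RegularExpressionParser {
    private(set) var pattern = ""

    func startOfLine() -> VerbalRegex { appending("^") }
    func digits() -> VerbalRegex { appending("[0-9]+") }
    func then(_ literal: String) -> VerbalRegex {
        appending(NSRegularExpression.escapedPattern(for: literal))
    }

    private func appending(_ piece: String) -> VerbalRegex {
        var copy = self
        copy.pattern += piece
        return copy
    }
}

let verbal = VerbalRegex().startOfLine().then("v").digits().then(".").digits()
let traditional = TraditionalRegex(pattern: "^v[0-9]+\\.[0-9]+")

print(verbal.pattern == traditional.pattern)      // true
print(verbal.firstMatch(in: "v10.3 beta") as Any) // Optional("v10.3")
```

Under this arrangement the verbal front end adds no matching code of its own, which speaks to the maturity and performance concerns raised earlier in the thread.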

- Dave Sweeris

On Feb 2, 2016, at 15:53, Howard Lovatt via swift-evolution <swift-evolution@swift.org> wrote:

I don't see that the two have to be exclusive. If the design of the regex literal is suitable for both a traditional NSRegularExpression and a verbal type implementation, then the two can coexist. It can also be staged, so that a literal can be introduced first with a bridge to legacy NSRegularExpression and then later a verbal implementation could be added. The key is to design a literal that is future-proofed.

On 3 February 2016 at 10:33, Patrick Gili <gili.patrick.r@gili-labs.com> wrote:
I don't feel good about this direction for the following reasons:
1) Complexity
2) Maturity? I don't know how Verbal Expressions has been implemented. Does it leverage mature regex open source? Or, has it been written from scratch?
3) Performance? Compiling a regex literal typically results in an FSM of some sort, optimized to parse strings. I wouldn't think that converting a regex literal to Verbal Expressions would yield great performance every time a match or substitution is done.

-Patrick

On Feb 2, 2016, at 5:55 PM, Howard Lovatt <howard.lovatt@gmail.com> wrote:

The difference is that I am proposing supporting both verbal expressions and regex literals, with literals converted to verbals and the processing happening at the verbal level. The reason for this is that verbals are easy to handle programmatically whilst literals are great for quickly specifying a regex.

On Tuesday, 2 February 2016, Patrick Gili <gili.patrick.r@gili-labs.com> wrote:
Hi Howard,

I don't see how this is very different from the Swift Verbal Expressions. It would suffer from the same disadvantages I have stated previously.

Cheers,
-Patrick

On Feb 2, 2016, at 1:51 AM, Howard Lovatt via swift-evolution <swift-evolution@swift.org> wrote:

Others have suggested a programmatic regex instead of a regex literal. How about doing both? Something like:

enum RegexElement {
    case capture(name: String, value: String)
    case special(Special)
    // ...
    enum Special: String {
        case startOfLine = "^"
        // ...
        case endOfLine = "$"
    }
}

// Define a regexLiteral syntax that the compiler understands that is of type Regex and consists of String representations of RegexElements, e.g. using forward slash:
// /<RegexElements>*/

struct Regex: CustomStringConvertible { // Compiled, immutable, thread safe, and bridged to NSRegularExpression
    // ... internal compiled representation
    let elements: [RegexElement]
    var description: String {
        return RegexElement.Special.startOfLine.rawValue // Example. Really returns all the elements converted back to a string
    }
    init(_ elements: RegexElement...) {
        self.elements = elements // Example. Really also compiles the expression
    }
    // init(regexLiteral regex: Regex) {
    // init(concatAll regexes: Regex...) {
    // init(fromString string: String) {
    // ... more inits
    func map<T>(input: String, @noescape mapper: (element: RegexElement) throws -> T) rethrows -> [T] {
        return [try mapper(element: RegexElement.special(.startOfLine))] // Example. Really does the matching
    }
    // func flatMap<T>(input: String, @noescape mapper: (element: RegexElement) throws -> T?) rethrows -> [T] {
    // func flatMap<S: SequenceType>(input: String, @noescape mapper: (element: RegexElement) throws -> S) rethrows -> [S.Generator.Element] {
    // func forEach(input: String, @noescape eacher: (element: RegexElement) throws -> Void) rethrows {
    // ... more funcs
}

let regex = Regex(RegexElement.special(.startOfLine)) // Normally a regex literal
let asStringArray = regex.map("Example") { element -> String in // Returns `["^"]` in example
    switch element {
    case let .capture(_, v): return v
    case let .special(s): return s.rawValue
    }
}

The advantages are:
  1. We get a literal type for convenience.
  2. We get a programmatic type when we need to manipulate regexes.
  3. Breaking the regex matches into the enum-defined elements of the regex works well with Swift pattern matching.
(Above is a very rough sketch!)
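
One way to read the idea of a programmatic type that is converted back to a string and compiled is sketched here. This is not the RegexElement design from the message above. PatternElement, ElementRegex, and fragment are hypothetical names, and the element cases are a simplified assumption. The only real API used is Foundation's NSRegularExpression.

```swift
import Foundation

// Hypothetical element type, simplified for illustration.
enum PatternElement {
    case literal(String)      // text to match verbatim
    case oneOrMore(String)    // a raw character class such as "[0-9]", repeated
    case startOfLine
    case endOfLine

    // The string form of this element inside a traditional pattern.
    var fragment: String {
        switch self {
        case .literal(let text):
            return NSRegularExpression.escapedPattern(for: text)
        case .oneOrMore(let characterClass):
            return characterClass + "+"
        case .startOfLine:
            return "^"
        case .endOfLine:
            return "$"
        }
    }
}

struct ElementRegex: CustomStringConvertible {
    let elements: [PatternElement]
    private let compiled: NSRegularExpression

    init(_ elements: PatternElement...) throws {
        self.elements = elements
        // Join the element fragments into one pattern and compile it once.
        let pattern = elements.map { $0.fragment }.joined()
        compiled = try NSRegularExpression(pattern: pattern, options: [])
    }

    var description: String { return compiled.pattern }

    func matches(_ input: String) -> Bool {
        let range = NSRange(input.startIndex..., in: input)
        return compiled.firstMatch(in: input, options: [], range: range) != nil
    }
}

let version = try ElementRegex(
    .startOfLine, .literal("v"), .oneOrMore("[0-9]"),
    .literal("."), .oneOrMore("[0-9]"), .endOfLine
)
print(version)
```

Because the elements are ordinary enum values, they can be inspected or rearranged before compilation, which is the manipulation advantage listed above.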

On 2 February 2016 at 16:44, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:
Something like Scala's extractors or F#'s Active Patterns would be most welcome to generalize pattern matching.

Reference: F Sharp Programming/Active Patterns (Wikibooks)

-Thorsten

On Feb 1, 2016, at 15:46, James Campbell via swift-evolution <swift-evolution@swift.org> wrote:

It would be great if we could create a generic way of making this Swifty. You might, say, want to implement a matching system for structures like JSON or XML (e.g. XQuery).


On Mon, Feb 1, 2016 at 2:43 PM, Patrick Gili via swift-evolution <swift-evolution@swift.org> wrote:
Hi Dany,

My response is inline below.

Cheers,
-Patrick

On Jan 31, 2016, at 8:56 PM, Dany St-Amant <dsa.mls@icloud.com> wrote:

On Jan 31, 2016, at 16:46, Patrick Gili <gili.patrick.r@gili-labs.com> wrote:

Hi Dany,

Please find my response inline below.

Cheers,
-Patrick

On Jan 31, 2016, at 3:46 PM, Dany St-Amant via swift-evolution <swift-evolution@swift.org> wrote:

This seems to be two proposals in one:
1. Initialize NSRegularExpression with a single String which includes options

The ultimate goal, based on the earlier mail in the thread, seems to be to enable, in a future proposal, things like: string ~= replacePattern, if string =~ pattern, decoupled from the legacy Obj-C. Isn’t NSRegularExpression part of the legacy? The conversion of the literal string to a regular expression should probably be part of the proposal for these operators, as that is when we will know how we want the text to be interpreted.

I don't see any evidence of NSRegularExpression becoming part of any legacy. Given SE-005, SE-006, and SE-023, the name is probably changing from NSRegularExpression to RegularExpression. However, I don't think the definition of the class will change, only the name.

I would like to see a regular expression matching operator, as in Ruby and Perl. I was trying to keep the proposal a minimal increment that would buy the biggest bang for the buck. We can already accomplish much of what other languages can do with regard to regular expressions. However, the notion of a regular expression isn't something we can work around with a custom library today. Can you suggest something additional that should be in the proposal?
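
As a rough illustration of what such a matching operator could look like as plain library code, here is a minimal sketch. The =~ operator is not part of Swift or Foundation; it is declared here purely for illustration, and it assumes Foundation's regular-expression search option on String. An invalid pattern simply yields false in this sketch.

```swift
import Foundation

// Hypothetical operator, declared only for this example.
infix operator =~ : ComparisonPrecedence

// Returns true when the pattern matches anywhere in the input.
func =~ (input: String, pattern: String) -> Bool {
    return input.range(of: pattern, options: .regularExpression) != nil
}

let isListAddress = "swift-evolution@swift.org" =~ "^swift-[a-z]+@"
let isOtherAddress = "howard@example.org" =~ "^swift-[a-z]+@"
print(isListAddress, isOtherAddress)
```

What a library cannot provide this way is the literal itself: the pattern here is still an ordinary String, with ordinary String escaping rules.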

Splitting a proposal into smaller ones has its advantages, but here I am just wondering whether we are sure that these future operations will use NSRegularExpression/RegularExpression. And does the currently selected syntax allow for future expansion? It would be bad to introduce something that needs to be torn out or changed in an incompatible way once we really start to use it in its final location.

The proposal is focused on search, but seems to skip substitution; I am unable to see an option in the proposal to replace all matches instead of only the first one. I, like many others, would expect regular expressions in a language to also support substitution.
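
For comparison, here is how the two substitution behaviours can be expressed with today's Foundation API. This is a minimal sketch and not part of the proposal; the sample path and replacement are made up for illustration.

```swift
import Foundation

let path = "/usr/local/bin:/usr/local/bin/tools"
let regex = try NSRegularExpression(pattern: "/usr/local", options: [])
let whole = NSRange(path.startIndex..., in: path)

// Replace every match (the equivalent of sed's "g" flag).
let allReplaced = regex.stringByReplacingMatches(
    in: path, options: [], range: whole, withTemplate: "/opt")

// Replace only the first match.
var firstReplaced = path
if let first = regex.firstMatch(in: path, options: [], range: whole),
   let range = Range(first.range, in: path) {
    firstReplaced.replaceSubrange(range, with: "/opt")
}

print(allReplaced)
print(firstReplaced)
```

A literal syntax with substitution support would need some way to select between these two behaviours.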

As for additions to the proposal, the processing of the string could support any character (within some limits) as the delimiter in place of the slash. With sed, when replacing a path component, one can do: echo $PWD | sed -e "s:^/usr/local/bin:/opt/share/bin:g", instead of escaping every single slash, which is really handy for making things easier to read.

Also, putting aside that I think \(scheme) should not be interpreted in the example, with a syntax allowing such interpretation the variable should be processed to generate proper escaping. If one uses \(filename), one gets "main.c", but one must use \(filename.escaped()) to get the proper "main\.c" and avoid matching "mainac". String.escaped() must produce a format compatible with the one used when converting the regular expression into an NSRegularExpression (I am not sure the two syntaxes are the same; I think that at least the handling of / may differ).
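
The "main.c" versus "mainac" problem can be reproduced with Foundation today. In this minimal sketch, NSRegularExpression.escapedPattern(for:) stands in for the String.escaped() method suggested above, which does not exist. The matches helper is a hypothetical function written only for this example.

```swift
import Foundation

let filename = "main.c"

// Interpolating the raw value leaves "." as a wildcard.
let unescapedPattern = "^\(filename)$"
// Escaping first turns "." into a literal dot.
let escapedPattern = "^\(NSRegularExpression.escapedPattern(for: filename))$"

// Hypothetical helper for this example only.
func matches(_ input: String, _ pattern: String) -> Bool {
    return input.range(of: pattern, options: .regularExpression) != nil
}

print(matches("mainac", unescapedPattern))
print(matches("mainac", escapedPattern))
print(matches("main.c", escapedPattern))
```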

I agree. Perhaps I went too far in keeping the proposal short and sweet, especially when you consider the rich syntax that Perl supports for substitution.

2. Easily create a String without escaping (\n is not linefeed, but \ and n)

The ability to not interpret the backslash as an escape can be useful in scenarios other than creating an NSRegularExpression, such as creating a Windows pathname, or creating regular expressions that are then given to an external tool. So this part of the proposal should probably be generalized.

Generalize it for what? If you're thinking along the lines of raw strings, I agree that we need this capability, as well as multi-line string literals. However, I would just as soon have separate proposals for these.

My point/opinion here is that regular expressions are just Strings that are then interpreted, the same way "Good Morning", "Bonjour", or "Marhaba" (even when written in Arabic script) are just Strings when you assign them to a variable in Swift and are then interpreted by the intended user. They are not String, frenchString, rightToLeftString. So I do not see why a regular expression should get privileged treatment and its own language-level syntax. The only difference when writing a regular expression, a Windows pathname, or any String whose syntax makes heavy use of backslashes is that one may want to disable the special meaning of the backslashes to make things more readable.

On the subject of geeky-ing the String, there are four main parts, IMHO:
- multi-line support
- a version with no backslash escaping (which should also not process the \(variable) format)
- inclusion of the String delimiter inside the String
- concatenation of the backslash and no-backslash versions. Bash example: echo 'echo "$BASH" shows '"$BASH"
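
For readers coming to this thread later: Swift releases after this discussion added multi-line string literals (Swift 4) and raw string literals delimited with # (Swift 5), which cover most of the four points above. A minimal sketch, with made-up sample values:

```swift
// No backslash escaping: the raw literal keeps backslashes as written.
let escapedPath = "C:\\Windows\\System32"
let rawPath = #"C:\Windows\System32"#

// Interpolation is off inside a raw literal unless the # is repeated.
let name = "BASH"
let mixed = #"raw \(name) stays literal, but \#(name) interpolates"#

// The String delimiter can appear inside a raw literal unescaped.
let quoted = #"echo "$BASH""#

// Multi-line support.
let multiLine = """
first line
second "quoted" line
"""

print(rawPath, mixed, quoted, multiLine, separator: "\n")
```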

I’m still trying to trace back the mail thread crumbs on these topics, since before restarting the discussion on them, the previous one should be properly summarized, unless such a summary already exists.

I think supporting interpolation is important. Both Perl and Ruby support it, and I'm sure there are other languages. One thing I forgot to put into the proposal: an option to disable interpolation or limit it to a single pass.

Looking ahead at the other responses, Chris Lattner has suggested that the proposal would have more traction if we can find a way to fold this into Swift's pattern matching. I can't say that I disagree, as this makes regular expressions more Swifty.
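
One existing hook for folding regexes into pattern matching is the ~= operator, which Swift uses to evaluate expression patterns in a switch. The following is only a sketch of that mechanism, not the design Chris Lattner had in mind. RegexPattern is a hypothetical wrapper around Foundation's NSRegularExpression.

```swift
import Foundation

// Hypothetical wrapper type used as a switch pattern.
struct RegexPattern {
    let regex: NSRegularExpression

    init(_ pattern: String) throws {
        regex = try NSRegularExpression(pattern: pattern, options: [])
    }
}

// Overloading ~= lets a RegexPattern appear in a case label for a String.
func ~= (pattern: RegexPattern, value: String) -> Bool {
    let range = NSRange(value.startIndex..., in: value)
    return pattern.regex.firstMatch(in: value, options: [], range: range) != nil
}

let integer = try RegexPattern("^[0-9]+$")
let word = try RegexPattern("^[A-Za-z]+$")

var labels: [String] = []
for input in ["42", "regex", "a-b"] {
    switch input {
    case integer: labels.append("integer")
    case word: labels.append("word")
    default: labels.append("other")
    }
}
print(labels)
```

This only yields a yes/no answer per case; binding capture groups to names inside the case, as Scala's extractors allow, would need language support.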

Regards,
Dany


On Jan 31, 2016, at 12:18, Patrick Gili via swift-evolution <swift-evolution@swift.org> wrote:

Here is the link to the proposal on GitHub:

https://github.com/gili-patrick-r/swift-evolution/blob/master/proposals/NNNN-regular-expression-literals.md

Cheers,
-Patrick

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


Hi Howard,

I'm not saying the two methods have to be exclusive. However, you asked us to consider converting regex literals into Swift Verbal Expressions. My response highlighted potential issues with this approach.

Cheers,
-Patrick


I just stumbled on some interesting articles and notes about implementing
regular expressions, written by Russ Cox (Google, Go):
https://swtch.com/~rsc/regexp/

···

On Wed, Feb 3, 2016 at 2:06 PM, Patrick Gili via swift-evolution < swift-evolution@swift.org> wrote:

Hi Howard,

I'm not saying the two methods don't have to be exclusive. However, you
asked us to consider converting regex literals into Swift Verbal
Expressions. My response highlighted potential issues with this approach.

Cheers,
-Patrick

On Feb 2, 2016, at 6:53 PM, Howard Lovatt <howard.lovatt@gmail.com> wrote:

I don't see that the two have to be exclusive. If the design of the regex
literal is suitable for both a traditional NSRegularExpression and a verbal
type implementation then the two can co-exist. It can also be staged, so
that a literal can be introduced first with a bridge to legacy
NSRegularExpression and then later a verbal implementation could be added.
The key is to design a liberal that is future proofed.

On 3 February 2016 at 10:33, Patrick Gili <gili.patrick.r@gili-labs.com> > wrote:

I don't feel good about this direction for the following reasons:
1) Complexity
2) Maturity? I don't know how Verbal Expressions has been implemented.
Does it leverage mature regex open source? Or, has it been written from
scratch?
3) Performance? Compiling a regex literal typically results in a FSM of a
sort, optimized to parse strings. I wouldn't think that converting a regex
literal to Verbal Expressions would yield great performance every time a
match or substitution is done.

-Patrick

On Feb 2, 2016, at 5:55 PM, Howard Lovatt <howard.lovatt@gmail.com> >> wrote:

The difference is that I am proposing supporting both verbal expressions
and regex literals and that - literals are converted to verbals and the
processing happens at the verbal level. The reason for this is that verbals
are easy to handle programmatically whilst literals are great for quickly
specifying a regex.

On Tuesday, 2 February 2016, Patrick Gili <gili.patrick.r@gili-labs.com> >> wrote:

Hi Howard,

I don't see how this is very different from the Swift Verbal
Expressions. It would suffer from the same disadvantages I have stated
previously.

Cheers,
-Patrick

On Feb 2, 2016, at 1:51 AM, Howard Lovatt via swift-evolution < >>> swift-evolution@swift.org> wrote:

Others have suggested a programatic regex instead of a regex literal,
how about doing both? Something like:

enum RegexElement {
    case capture(name: String, value: String)
    case special(Special)
    // ...
    enum Special: String {
        case startOfLine = "^"
        // ...
        case endOfLine = "$"
    }
}

// Define a regexLiteral syntax that the compiler understands that is of
type Regex and consists of String representations of RegexElements, e.g.
using forward slash:
// /<RegexElements>*/

struct Regex: CustomStringConvertible { // Compiled, immutable, thread
safe, and bridged to NSRegularExpression
    // ... internal compiled representation
    let elements: [RegexElement]
    var description: String {
        return RegexElement.Special.startOfLine.rawValue // Example.
Really returns all the elements converted back to a string
    }
    init(_ elements: RegexElement...) {
        self.elements = elements // Example. Really also compiles the
expression
    }
    // init(regexLiteral regex: Regex) {
    // init(concatAll regexes: Regex...) {
    // init(fromString string: String) {
    // ... more inits
    func map<T>(input: String, @noescape mapper: (element: RegexElement)
throws -> T) rethrows -> [T] {
        return [try mapper(element: RegexElement.special(.startOfLine))]
// Example. Really does the matching
    }
    // func flatMap<T>(input: String, @noescape mapper: (element:
RegexElement) throws -> T?) rethrows -> [T] {
    // func flatMap<S: SequenceType>(input: String, @noescape mapper:
(element: RegexElement) throws -> S) rethrows -> [S.Generator.Element] {
    // func forEach(input: String, @noescape eacher: (element:
RegexElement) throws -> Void) rethrows {
    // ... more funcs
}

let regex = Regex(RegexElement.special(.startOfLine)) // Normally a
regex literal
let asStringArray = regex.map("Example") { element -> String in //
Returns `["^"]` in example
    switch element {
    case let .capture(_, v): return v
    case let .special(s): return s.rawValue
    }
}

The advantages are:

   1. We get a literal type for convenience.
   2. We get a programatic type when we need to manipulate regexes.
   3. Breaking the regex matches into the enum defined elements of
   the regex works well with Swift pattern matching.

(Above is a very rough sketch!)

On 2 February 2016 at 16:44, Thorsten Seitz via swift-evolution < >>> swift-evolution@swift.org> wrote:

Something like Scala's extractors or F#'s Active Patterns would be most
welcome to generalize pattern matching.

Redirecting…
F Sharp Programming/Active Patterns - Wikibooks, open books for an open world

-Thorsten

Am 01.02.2016 um 15:46 schrieb James Campbell via swift-evolution < >>>> swift-evolution@swift.org>:

It would be great if we could create a generic way of making this
swifty. You may let say want to implement a matching system for structure
like JSON or XML (i.e XQuery).

*___________________________________*

*James⎥Lead Engineer*

*james@supmenow.com⎥supmenow.com <http://supmenow.com/&gt;\*

*Sup*

*Runway East *

*10 Finsbury Square*

*London*

* EC2A 1AF *

On Mon, Feb 1, 2016 at 2:43 PM, Patrick Gili via swift-evolution < >>>> swift-evolution@swift.org> wrote:

Hi Dany,

My response is inline below.

Cheers,
-Patrick

On Jan 31, 2016, at 8:56 PM, Dany St-Amant <dsa.mls@icloud.com> wrote:

Le 31 janv. 2016 à 16:46, Patrick Gili <gili.patrick.r@gili-labs.com> >>>>> a écrit :

Hi Dany,

Please find my response inline below.

Cheers,
-Patrick

On Jan 31, 2016, at 3:46 PM, Dany St-Amant via swift-evolution < >>>>> swift-evolution@swift.org> wrote:

This seem to be two proposals in one:
1. Initialize NSRegularExpression with a single String which includes
options

The ultimate goal based on the earlier mail in the thread seems to be
able in a future proposal do thing like: string ~= replacePattern, if
string =~ pattern, decoupled from the legacy Obj-C. Isn’t
NSRegularExpression part of the legacy? The conversion of the literal
string as regular expression should probably part of the proposal for these
operators; as this is the time we will know how we want the text to be
interpreted.

I don't see any evidence of NSRegularExpression becoming part of any
legacy. Given SE-005, SE-006, and SE-023, the name is probably changing
from NSRegularExpression to RegularExpression. However, I don't think the
definition of the class will change, only the name.

I would like to see an operator regular expression matching operator,
like Ruby and Perl. I was trying to keep the proposal a minimal increment
that would buy the biggest bang for the buck. We can already accomplish
much of what other languages can do with regard to regular expression.
However, the notion of a regular expression isn't something we can work
around with custom library today. Can you suggest something addition that
should be in the proposal?

Splitting proposal in smaller ones have its advantage, but here I am
just wondering if we are sure that these future operation will use the
NSRegularExpression/RegularExpression. And does the currently selected
syntax allow for future expansion, it would be bad to introduce something
that need to be torn away or changed in an incompatible way, once we
really start to use them in their final location.

The proposal is focused on the search, but seem to skip the
substitution; I am unable to see an option to replace all matches instead
of the first one only in the proposal. I, as many other, would expect
regular expression in a language to also support substitution.

As for addition to the proposal, the processing of the string could be
support for any character (within some limit) for the slash delimiter. With
sed, when replacing path component, one can do: echo $PWD | sed -e
"s:^/usr/local/bin:/opt/share/bin:g", instead of escaping every
single slashes. Which is really handy to make thing easier to read.

Also, putting aside that I think \(scheme) should not be interpreted
in the example, with a syntax allowing such interpretation the variable
should be processed to generate proper escaping. If one is to use
\(filename) you get "main.c", but one must use \(filename.escaped()) to get
the proper "main\.c" to avoid matching "mainac". The String.escaped() must
be in a format compatible with the format used when converting the regular
expression into NSRegularExpression (not sure if the two syntax are the
same; I think that at least the handling of / may differ)

I agree. Perhaps I went too far with keeping the proposal
short-and-sweet. Especially when you consider the rich syntax that Perl
supports for substitution.

2. Easily create a String without escaping (\n is not linefeed, but \
and n)

The ability to not interpret the backslash as escape can be useful in
other scenario that creating a NSRegularExpression; like creating a Windows
pathname, or creating regular expression which are then given to external
tool. So this part of the proposal should probably be generalized.

Generalize it for what? If you're thinking along the line of raw
strings, I agree that we need this capability, as well as multi-line string
literals. However, I just soon we have separate proposals for this.

My point/opinion here is that a regular expression is just a String
which is then interpreted; the same way as "Good Morning", "Bonjour", or
"Marhaba" (even when using the Arabic script) are just Strings when you
assign them to a variable in Swift, and are then interpreted by the
intended user. They are not String, frenchString, rightToLeftString. So I
do not see why a regular expression should have privileged treatment and
have its own language-level syntax. The only difference when writing a
regular expression, a Windows pathname, or any String whose syntax makes
heavy use of backslashes is that one may want to disable the special
meaning of the backslash to make things more readable.

On the topic of geeking out the String, there are four main parts IMHO:
- multi-line support
- a version with no backslash escaping (which should also mean no
processing of the \(variable) format)
- inclusion of the String delimiter inside the String
- concatenation of the backslash/no-backslash versions. Bash example: echo 'echo
"$BASH" shows '"$BASH"
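
As a hedged illustration of what these items could look like, the sketch
below uses string features that Swift gained after this discussion
(multi-line literals in Swift 4, raw literals in Swift 5). The shell value
is a made-up stand-in for bash's $BASH variable.

```swift
// Hypothetical stand-in for the value of $BASH in the bash example.
let shell = "/bin/bash"

// Multi-line support: indentation up to the closing delimiter is stripped.
let multiLine = """
    first line
    second line
    """

// Including the string delimiter inside the string, without escaping it.
let quoted = #"She said "hello""#

// Concatenating a non-interpreted part with an interpreted part,
// mirroring: echo 'echo "$BASH" shows '"$BASH"
let mixed = #"echo "$BASH" shows "# + "\(shell)"

print(multiLine)
print(quoted)  // She said "hello"
print(mixed)   // echo "$BASH" shows /bin/bash
```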

I’m still trying to track down the mail thread crumbs on these topics,
since before restarting the discussion on these topics, the previous one
should be properly summarized, unless such a summary already exists.

I think supporting interpolation is important. Both Perl and Ruby
support it, and I'm sure there are other languages that do. One thing I
forgot to put into the proposal: an option to disable interpolation or
limit it to a single pass.

Looking ahead at the other responses, Chris Lattner has suggested that
the proposal would have more traction if we can find a way to fold this
into Swift's pattern matching. I can't say that I disagree, as this makes
regular expressions more Swifty.
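
One way this folding could look, as a sketch only: overloading the ~=
operator lets a regular expression appear as a case in a switch statement.
RegexPattern and classify are hypothetical names invented for this example,
not part of the proposal or of anything suggested in the thread.

```swift
import Foundation

/// Hypothetical wrapper so a pattern can be told apart from a plain String.
struct RegexPattern {
    let source: String
    init(_ source: String) { self.source = source }
}

/// Lets `case RegexPattern("...")` match a String in a switch statement.
/// An invalid pattern simply fails to match in this sketch.
func ~= (pattern: RegexPattern, value: String) -> Bool {
    return value.range(of: pattern.source, options: .regularExpression) != nil
}

func classify(_ text: String) -> String {
    switch text {
    case RegexPattern("^[0-9]+$"):
        return "number"
    case RegexPattern("^[a-z]+$"):
        return "word"
    default:
        return "other"
    }
}

print(classify("12345"))   // number
print(classify("swift"))   // word
print(classify("Swift 2")) // other
```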

Regards,
Dany

On 31 Jan 2016, at 12:18, Patrick Gili via swift-evolution <swift-evolution@swift.org> wrote:

Here is the link to the proposal on GitHub:

https://github.com/gili-patrick-r/swift-evolution/blob/master/proposals/NNNN-regular-expression-literals.md

Cheers,
-Patrick

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

--
  -- Howard.

--
bitCycle AB | Smedjegatan 12 | 742 32 Östhammar | Sweden

Phone: +46-73-753 24 62
E-mail: jens@bitcycle.com