regex's Introduction

Regex

Pattern match like a boss.

Usage

Create:

// Use `Regex.init(_:)` to build a regex from a static pattern

let greeting = Regex("hello (world|universe)")

// Use `Regex.init(string:)` to construct a regex from dynamic data, and
// gracefully handle invalid input

var validations: [String: Regex] = [:]

for (name, pattern) in config.loadValidations() {
  do {
    validations[name] = try Regex(string: pattern)
  } catch {
    print("error building validation \(name): \(error)")
  }
}

Match:

if greeting.matches("hello universe!") {
  print("wow, you're friendly!")
}

Pattern match:

switch someTextFromTheInternet {
case Regex("DROP DATABASE (.+)"):
  // TODO: patch security hole
  break
default:
  break
}

Capture:

let greeting = Regex("hello (world|universe|swift)")

if let subject = greeting.firstMatch(in: "hello swift")?.captures[0] {
  print("ohai \(subject)")
}

Find and replace:

"hello world".replacingFirst(matching: "h(ello) (\\w+)", with: "H$1, $2!")
// "Hello, world!"

Accessing the last match:

switch text {
case Regex("hello (\\w+)"):
  if let friend = Regex.lastMatch?.captures[0] {
    print("lovely to meet you, \(friend)!")
  }
case Regex("goodbye (\\w+)"):
  if let traitor = Regex.lastMatch?.captures[0] {
    print("so sorry to see you go, \(traitor)!")
  }
default:
  break
}

Options:

let totallyUniqueExamples = Regex("^(hello|foo).*$", options: [.ignoreCase, .anchorsMatchLines])
let multilineText = "hello world\ngoodbye world\nFOOBAR\n"
let matchingLines = totallyUniqueExamples.allMatches(in: multilineText).map { $0.matchedString }
// ["hello world", "FOOBAR"]

Decode:

let json = """
  [
    {
      "name" : "greeting",
      "pattern" : "^(\\\\w+) world!$"
    }
  ]
  """.data(using: .utf8)!

struct Validation: Codable {
  var name: String
  var pattern: Regex
}

let decoder = JSONDecoder()
try decoder.decode(Validation.self, from: json)
// Validation(name: "greeting", pattern: /^(\w+) world!$/)

Ranges:

let lyrics = """
  So it's gonna be forever
  Or it's gonna go down in flames
  """

let possibleEndings = Regex("it's gonna (.+)")
    .allMatches(in: lyrics)
    .compactMap { $0.captureRanges[0] }
    .map { lyrics[$0] }

// it's gonna: ["be forever", "go down in flames"]

Installation

Regex supports Swift 4.2 and above, on all Swift platforms.

Swift Package Manager

Add a dependency to your Package.swift:

let package = Package(
  name: "MyPackage",
  dependencies: [
    // other dependencies...
    .package(url: "https://github.com/sharplet/Regex.git", from: "2.1.0"),
  ]
)

Carthage

Put this in your Cartfile:

github "sharplet/Regex" ~> 2.1

CocoaPods

Put this in your Podfile:

pod "STRegex", "~> 2.1"

Contributing

See CONTRIBUTING.md.

Development Setup

Swift Package Manager

Build and run the tests:

swift test

# or just

rake test:package

If you're on a Mac, testing on Linux is supported via Docker for Mac. Once Docker is set up, start a Linux shell:

rake docker

And run the tests via Swift Package Manager.

Xcode

xcpretty is recommended, for prettifying test output:

gem install xcpretty

Then run the tests:

# one of
rake test:osx
rake test:ios
rake test:tvos

Formatting & Linting

Regex uses SwiftFormat to maintain consistent code formatting.

Regex uses SwiftLint to validate code style. SwiftLint is automatically run against pull requests using Hound CI.

When submitting a pull request, running these tools and addressing any issues is much appreciated!

brew bundle
swiftformat .
swiftlint

License

See LICENSE.txt.

regex's People

Contributors

alexeyvasilyevrn, andrew804, mezhevikin, sclukey, sharplet, tonyarnold, wspl

regex's Issues

Add a changelog

This one would be a great place to start for anyone looking to make a contribution.

See http://keepachangelog.com/ for the format I'd like to use.

I've been keeping an informal changelog in the release notes, so that would probably be a great place to start. I'd like the changelog to retroactively cover all existing releases, and then CONTRIBUTING.md can be updated to request that the changelog be updated with each pull request.

Regex isn't on CocoaPods trunk

  • You can still use Regex directly in your Podfile, which is great;
  • BUT, you can't yet transitively depend on Regex from another pod.

Long-term, I've started discussions with the author of https://github.com/brynbellomy/Regex about the possibility of merging the two projects.

Short-term, I can currently think of two options:

  • Just prefix the podspec and push it to Trunk. The project would still be called Regex officially, but the pod would be named something like STRegex. This would be pretty simple, though still potentially quite confusing for users.
  • Maintain a spec repo dedicated to Regex and ask users to specify a custom source in their Podfile. However I don't believe this solves the namespace collision problem with the master spec repo. I'd like to investigate this further.

Regex shouldn't be StringLiteralConvertible

As convenient as this may seem, Regex should not be a StringLiteralConvertible.

Consider this code:

func crasher(s: String) {
    switch s {
    case "*starred*":
         print("starred")
    case "any":
         print("any")
    default:
        print("unknown")
    }
}

Calling the crasher function crashes the app. Why? Because the case string is silently converted to a Regex instance by way of its StringLiteralConvertible initializer!

In other words, simply using Regex in an application silently turns all case string tests into Regex instances.

This has two drawbacks:

  • it hampers performance when not needed
  • it risks crashing the application, like in the example above, because the string is not a valid regex.

Moreover, your README example already shows a correct case statement that uses an explicit Regex instance.

Simply removing conformance to the StringLiteralConvertible protocol will fix this issue.

Thanks for a nice µframework!

public struct Options has generic name that cannot be disambiguated

Because Regex is the name of both the struct and the module, Regex's struct Options cannot be disambiguated as Regex.Options, as discussed in Swift bug SR-5304. The available workarounds are less than ideal: https://stackoverflow.com/a/38508385/592739

Renaming either the module or the struct would break existing code. Would it be terrible to rename Options to something like RegexOptions? Alternatively, adding a public typealias RegexOptions = Options might work.

Swift PM Integration

I cannot import Regex into my project. The swift package update command reports:

error: the package has an unsupported layout, unexpected source file(s) found: /blablabla/Packages/Regex-0.3.1/Tests/OptionsSpec.swift, /blablabla/Packages/Regex-0.3.1/Tests/RegexSpec.swift, /blablabla/Packages/Regex-0.3.1/Tests/StringReplacementSpec.swift
fix: move the file(s) inside a module

UTF16View

Hi,

First off, I love this library and use it all the time. Great job! Also good to see recent maintenance!

This morning, I was reading a post about Swift 5 strings and noticed a comment about using UTF8View instead of UTF16View in some cases: https://swift.org/blog/utf8-string/#impact-on-existing-code

MatchResult.swift uses UTF16View in several places. Is this something that should be refactored per the article?

Respectfully,
David

NSAttributedString support?

Accessing the underlying matched NSRange for use with NSAttributedString has come up in both #26 and #37.

At this point I still consider the Foundation types implementation details. There are gotchas related to the use of UTF-16, and I'd prefer to err on the side of safety, where it's not possible to make that kind of mistake.

There are probably two main considerations stopping me from adding NSAttributedString support at the moment:

  • So far I've endeavoured to only expose types in the standard library in the public API. NSAttributedString isn't bridged, so I've avoided supporting it. If there was a first-class attributed string value type in the standard library, I think I'd be willing to support it.
  • At this point I'm still interested in reserving the right to replace the regular expression engine with something else, like PCRE.

In a world where Foundation is fully supported and cross-platform (and NSRegularExpression is implemented—it wasn't last time I checked), these reasons will start to hold less water.

Update: Moving away from NSRegularExpression at this stage is unlikely. So the main issue here seems like an API design one.

NSRegularExpressionOptions?

I'm personally fairly happy with the defaults, but if there's enough demand for some options it would be good to look into.

Case (in)sensitivity is one consideration.
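
As a rough illustration of what such options do at the Foundation level, here is a minimal sketch that calls NSRegularExpression directly. The helper name matchedStrings(of:in:options:) is hypothetical and is not part of Regex; only the NSRegularExpression.Options values are real API.

```swift
import Foundation

// Hypothetical helper (not part of Regex): returns every matched substring.
func matchedStrings(of pattern: String,
                    in text: String,
                    options: NSRegularExpression.Options = []) -> [String] {
    // An invalid pattern simply yields no matches in this sketch.
    guard let regex = try? NSRegularExpression(pattern: pattern, options: options) else {
        return []
    }
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    return regex.matches(in: text, options: [], range: fullRange).compactMap { match in
        // Convert the UTF-16 based NSRange back into a String range.
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

let multilineText = "hello world\ngoodbye world\nFOOBAR\n"

// Case-insensitive, with ^ and $ matching at line boundaries.
let relaxed = matchedStrings(of: "^(hello|foo).*$",
                             in: multilineText,
                             options: [.caseInsensitive, .anchorsMatchLines])
print(relaxed) // ["hello world", "FOOBAR"]

// With the defaults, ^ and $ only anchor to the whole input, so nothing matches.
let strict = matchedStrings(of: "^(hello|foo).*$", in: multilineText)
print(strict) // []
```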

We should be using Substring

While working on #69, I realised that we’re doing far more string copying than we should be. The built in Substring type is a perfect way to represent substrings as “real” strings in a performant way. MatchResult is a bundle of ranges that map onto a single string, and should really be exposing Substring in its API as much as possible to avoid the many unnecessary copies.

It would have been nice if I’d noticed this before 2.0, but unfortunately that ship has sailed. If we get another opportunity to make breaking changes, this is a perfect one.
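
To make the idea concrete, here is a minimal sketch of a match lookup that hands back a Substring instead of a copied String. It uses NSRegularExpression directly, and the function firstMatch(of:in:) is a hypothetical name for illustration, not the library's MatchResult API.

```swift
import Foundation

// Hypothetical helper: returns the first match as a slice of `text`.
func firstMatch(of pattern: String, in text: String) -> Substring? {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: fullRange),
          let range = Range(match.range, in: text) else { return nil }
    // Slicing yields a Substring that shares storage with `text`.
    return text[range]
}

let lyrics = "So it's gonna be forever"
let slice = firstMatch(of: "gonna \\w+", in: lyrics)

if let slice = slice {
    print(slice)               // gonna be
    // Callers that need to keep the value around can copy explicitly.
    let owned = String(slice)
    print(owned.count)         // 8
}
```

The copy then happens only where the caller asks for it, rather than once per match and capture.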

Problem installing via cocoapods (fatal: No names found, cannot describe anything)

Hey,

When attempting to install via cocoapods, I'm encountering the following error:

Updating local specs repositories
Analyzing dependencies
Pre-downloading: `Regex` from `https://github.com/sharplet/Regex.git`
fatal: No names found, cannot describe anything.
[!] Unable to satisfy the following requirements:

- `Regex (from `https://github.com/sharplet/Regex.git`)` required by `Podfile`

I'm kind of new to iOS programming. Any idea what could be going wrong?

Specify swift version

I'm getting the following message during the build. I can get around it by specifying the version in the consuming pod, but it would be nice if the cause were fixed in the dependency itself:

    - `STRegex` does not specify a Swift version and none of the targets (`Runner`) integrating it have the `SWIFT_VERSION` attribute set. Please contact the author or set the `SWIFT_VERSION`
    attribute in at least one of the targets that integrate this pod.

String find and replace

I think it would be worth supporting some find and replace functions with support for back-references. Here's the API I have in mind:

extension String {

  /// Replace the first occurrence of `pattern` with `replacement`.
  func stringByReplacingFirstMatching(pattern: String, with replacement: String) -> String

  /// Replace all occurrences of `pattern` with `replacement`.
  func stringByReplacingAllMatching(pattern: String, with replacement: String) -> String

  /// Mutating form of `stringByReplacingFirstMatching(_:with:)`.
  mutating func replaceFirstMatching(pattern: String, with replacement: String)

  /// Mutating form of `stringByReplacingAllMatching(_:with:)`.
  mutating func replaceAllMatching(pattern: String, with replacement: String)

}

We should be able to implement this with NSRegularExpression.stringByReplacingMatchesInString(_:options:range:withTemplate:) and NSRegularExpression.replaceMatchesInString(_:options:range:withTemplate:).
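
A minimal sketch of how the two non-mutating forms could sit on top of NSRegularExpression, using the current Swift spellings of those Foundation methods. The free functions below are hypothetical stand-ins for the proposed String extension, and returning the input unchanged for an invalid pattern is an assumption made for brevity.

```swift
import Foundation

// Hypothetical stand-in for "replace all": delegates to Foundation's template replacement.
func replacingAllMatches(in input: String, pattern: String, with template: String) -> String {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return input }
    let fullRange = NSRange(input.startIndex..<input.endIndex, in: input)
    return regex.stringByReplacingMatches(in: input, options: [], range: fullRange,
                                          withTemplate: template)
}

// Hypothetical stand-in for "replace first": expands the template for one match only.
func replacingFirstMatch(in input: String, pattern: String, with template: String) -> String {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return input }
    let fullRange = NSRange(input.startIndex..<input.endIndex, in: input)
    guard let match = regex.firstMatch(in: input, options: [], range: fullRange),
          let range = Range(match.range, in: input) else { return input }
    // Back-references such as $1 are resolved against this single match.
    let replacement = regex.replacementString(for: match, in: input, offset: 0,
                                              template: template)
    return input.replacingCharacters(in: range, with: replacement)
}

let first = replacingFirstMatch(in: "hello world", pattern: "h(ello) (\\w+)", with: "H$1, $2!")
print(first) // Hello, world!

let all = replacingAllMatches(in: "a1 b2 c3", pattern: "([a-z])(\\d)", with: "$2$1")
print(all) // 1a 2b 3c
```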

Cocoapod support for :git

Using the :git tag in a Podfile does not work. I receive the following errors on the command line during the update:

fatal: No names found, cannot describe anything.

[!] Unable to satisfy the following requirements:

  - `STRegex (from `https://github.com/sharplet/Regex.git`)` required by `Podfile`

Linux support

So, some day it looks like NSRegularExpression will be ported to Linux. \o/

At the time of writing, the implementation is stubbed out and throws a runtime error: https://github.com/apple/swift-corelibs-foundation/blob/0254bc20d453626494d0e51d0cae35cc61ee3e46/Foundation/NSRegularExpression.swift.

Theoretically this should mean that you can compile and link against Regex on Linux, but when I tested it I got some linker errors. For now, I've filed a bug here: https://bugs.swift.org/browse/SR-82.

It's not possible to recover from an invalid regex

I've been thinking about this scenario, which I'd like to be able to support:

  1. App is loading regular expressions from an external source, say a local or remote config file;
  2. After loading the config, attempt to parse it as a Regex;
  3. Recovery from a failure should be possible. The decision to crash or not is really an application-specific one, and shouldn't be imposed by this library.

So I can see the following possibilities:

  1. Regex.init is updated to return an implicitly-unwrapped optional. Statically defined Regex are still as convenient to construct as they are presently.
  2. Use a regular optional failable initialiser, rather than an implicitly unwrapped optional. Callers can still force-unwrap when using static regex.
  3. Use a throwing initialiser. The simplest way would be to re-throw the underlying NSRegularExpression error. try? and try! are still usable for increased convenience.
    3a. Like 3, but instead of exposing the underlying error, introduce a custom ErrorType.

I'm currently leaning towards option 2, which I believe straddles the line between convenience and safety fairly well. However there's some risk of losing detailed failure information here.

Options 2 and 3 could be combined to form a hybrid approach where there are two initialisers, one with an external parameter label (say pattern:), and one without (like today). However I think this unnecessarily complicates the API, and I'd prefer to err on the side of the simplicity we have today. I'd also prefer to avoid breaking encapsulation by exposing the underlying NSRegularExpression error, but introducing an error type feels like overkill to me.

I don't think option 1 feels like a very idiomatic Swift API. In general, safety should probably be the default position.
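
For comparison, the hybrid approach mentioned above can be sketched as follows: a labelled throwing initialiser for dynamic patterns, and an unlabelled one that traps for static patterns. This mirrors the two initialisers shown in the README's Usage section, but the type SketchRegex and its internals are assumptions for illustration only.

```swift
import Foundation

// Hypothetical wrapper type, for illustration only.
struct SketchRegex {
    let underlying: NSRegularExpression

    // Dynamic input: the caller decides how to recover from a bad pattern.
    init(string pattern: String) throws {
        underlying = try NSRegularExpression(pattern: pattern)
    }

    // Static input: a bad pattern is a programmer error, so trap with a message.
    init(_ pattern: StaticString) {
        do {
            underlying = try NSRegularExpression(pattern: pattern.description)
        } catch {
            preconditionFailure("invalid static pattern: \(error)")
        }
    }

    func matches(_ text: String) -> Bool {
        let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
        return underlying.firstMatch(in: text, options: [], range: fullRange) != nil
    }
}

// A static pattern needs no error handling at the call site.
let greeting = SketchRegex("hello (world|universe)")
print(greeting.matches("hello universe!")) // true

// A pattern loaded at runtime can fail without crashing the app.
do {
    _ = try SketchRegex(string: "(unbalanced")
} catch {
    print("error building validation: \(error)")
}
```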

Different replace behavior

Hi,

I'm trying to remove the contents of a string matching the following regex: (\((?:[A-N]\d?(?:, )?)+\)).

The idea is to remove these allergen-notices from a string like this:

Frikadelle von Rind und Schwein (A, A1, C, I, J) mit Paprika-Zwiebelgemüse (J), dazu Spätzle (A, A1, C)

I tried the following:

let regex = Regex(#" (\((?:[A-N]\d?(?:, )?)+\))"#)
let title = oldTitle.replacingAll(matching: regex, with: "")

For some reason this leads to "Paprika-Zwiebelgemüse (J)" being turned into "Paprika-Zwiebelgemüs)" and not "Paprika-Zwiebelgemüse" as expected.

When I use the native String API (.replacingOccurrences(of: #" (\((?:[A-N]\d?(?:, )?)+\))"#, with: "", options: .regularExpression)), it works like it should.

Is this a bug in Regex or am I doing something wrong? This is on iOS in case that's relevant.

Thanks!

'advanced(by:)' is unavailable (Swift 4)

In MatchResult.swift, lines 103 to 108:

  private func utf16Range(from range: NSRange) -> Range<String.UTF16Index>? {
    guard range.location != NSNotFound else { return nil }
    let start = string.startIndex.advanced(by: range.location)
    let end = start.advanced(by: range.length)
    return start..<end
  }

106:21:
'advanced(by:)' is unavailable
'advanced(by:)' was obsoleted in Swift 4.0
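
One possible way to express the same conversion in Swift 4 is to ask the UTF-16 view to offset its own indices. This is a sketch only: the string is passed as a parameter here because the surrounding type is not shown, and the bounds checking via limitedBy is an addition that the original helper did not have.

```swift
import Foundation

// Hypothetical free-function variant of the helper quoted above.
func utf16Range(from range: NSRange, in string: String) -> Range<String.Index>? {
    guard range.location != NSNotFound else { return nil }
    let utf16 = string.utf16
    // index(_:offsetBy:limitedBy:) returns nil instead of trapping when out of bounds.
    guard let start = utf16.index(utf16.startIndex, offsetBy: range.location,
                                  limitedBy: utf16.endIndex),
          let end = utf16.index(start, offsetBy: range.length,
                                limitedBy: utf16.endIndex)
    else { return nil }
    return start..<end
}

let text = "hello world"
let wordRange = utf16Range(from: NSRange(location: 6, length: 5), in: text)
if let wordRange = wordRange {
    print(text[wordRange]) // world
}

let outOfBounds = utf16Range(from: NSRange(location: 20, length: 1), in: text)
print(outOfBounds == nil) // true
```

Foundation also provides Range(_:in:), which performs a similar NSRange conversion in one call.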

Regex.matches does not exist

I was testing the Regex object and the README has an example using matches but it does not exist.

Regex("Custom from:([0-9]+) to:([0-9]+)").matches(value) //error

Another thing: is it normal for the first capture to be the full matched value?

let matchResult = Regex("Custom from:([0-9]+) to:([0-9]+)").match(value)
// value e.g.: Custom from:1451811437 to:1452329837

print(matchResult.captures.count) //prints 3
print(matchResult.captures[0]) //prints Custom from:1451811437 to:1452329837
print(matchResult.captures[1]) //prints 1451811437
print(matchResult.captures[2]) //prints 1452329837
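
For context, the underlying NSTextCheckingResult reports the whole match as range 0 and the capture groups from range 1 onwards, which would explain the count of 3. A minimal sketch of skipping range 0, using NSRegularExpression directly (the helper captureGroups(of:in:) is a hypothetical name, not the library's API):

```swift
import Foundation

// Hypothetical helper: returns only the capture groups of the first match.
func captureGroups(of pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: fullRange) else {
        return []
    }
    // Range 0 is the entire match; groups live at 1..<numberOfRanges.
    // Groups that did not participate in the match are skipped in this sketch.
    return (1..<match.numberOfRanges).compactMap { index in
        Range(match.range(at: index), in: text).map { String(text[$0]) }
    }
}

let value = "Custom from:1451811437 to:1452329837"
let groups = captureGroups(of: "Custom from:([0-9]+) to:([0-9]+)", in: value)
print(groups) // ["1451811437", "1452329837"]
```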

Crash when matching LF against CRLF

The following code crashes because the matched range splits an extended grapheme cluster ("\r\n" is a single Character in Swift), so the resulting range is not valid for the Swift string.

let source = "\r\n"
let regex = Regex("\n")
let match = regex.firstMatch(in: source)!
let matchedString = match.matchedString // crash here
XCTAssertEqual(matchedString, "\n")
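
One way to sidestep the grapheme boundary problem is to slice the string's UTF-16 view, where every code unit offset is a valid position, and decode the slice into a new String. The sketch below uses NSRegularExpression directly; the helper matchedText(of:in:) is hypothetical and is not how the library necessarily resolved this issue.

```swift
import Foundation

// Hypothetical helper: extracts the first match by slicing UTF-16 code units.
func matchedText(of pattern: String, in source: String) -> String? {
    let fullRange = NSRange(location: 0, length: source.utf16.count)
    guard let regex = try? NSRegularExpression(pattern: pattern),
          let match = regex.firstMatch(in: source, options: [], range: fullRange)
    else { return nil }

    let utf16 = source.utf16
    let start = utf16.index(utf16.startIndex, offsetBy: match.range.location)
    let end = utf16.index(start, offsetBy: match.range.length)
    // Decoding copies the code units; an unpaired surrogate would be
    // repaired with U+FFFD rather than causing a crash.
    return String(decoding: utf16[start..<end], as: UTF16.self)
}

let lineFeed = matchedText(of: "\n", in: "\r\n")
print(lineFeed == "\n") // true
```

The trade-off is that the result is a copy rather than a slice of the original string.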
