PatternKit is a library for matching patterns in collections. Patterns have many of the basic capabilities of regular expressions, but have a different syntax and can match on any Collection of Hashable elements.
PatternKit is a proof of concept. It is in an early stage of design and should not be used in production. It currently supports the fundamental features of a regular expression.
The matches(with:) method returns a lazy collection with all non-overlapping matches for the pattern throughout the collection being searched.
You can make a pattern for a simple sequence of elements with the postfix / operator.
let greetingCount = text.matches(with: "hi"/).count
matches(with:) returns a series of PatternMatch instances; each contains a SubSequence of the original collection, including its range, which matched the pattern.
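To make "lazy" and "non-overlapping" concrete, here is a minimal standard-library sketch of that scanning behavior for a plain sequence of elements. It is not PatternKit's implementation: lazyRanges(of:in:) is a hypothetical helper invented for this illustration, and it handles only literal element sequences, not general patterns.

```swift
/// Hypothetical helper (not part of PatternKit): lazily yields the ranges of
/// the non-overlapping occurrences of `needle` in `haystack`, left to right.
func lazyRanges<C: Collection>(of needle: C, in haystack: C) -> AnySequence<Range<C.Index>>
    where C.Element: Equatable
{
    let width = needle.count
    return AnySequence(sequence(state: haystack.startIndex) { (cursor: inout C.Index) -> Range<C.Index>? in
        // An empty needle would match everywhere; this sketch reports nothing instead.
        guard width > 0 else { return nil }
        while cursor != haystack.endIndex {
            if haystack[cursor...].starts(with: needle) {
                let end = haystack.index(cursor, offsetBy: width)
                let match = cursor..<end
                cursor = end    // resume after the match, so results never overlap
                return match
            }
            haystack.formIndex(after: &cursor)
        }
        return nil
    })
}
```

Each range is computed only when the caller asks for the next one, so a loop that breaks after the first match does no further scanning. Searching [1, 1, 1, 1, 1] for [1, 1] yields 0..<2 and 2..<4; the leftover final element cannot start another match.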
Use | to match either of two patterns. The leftmost pattern is preferred.
for match in text.matches(with: "hello"/ | "hi"/) {
    print(match.contents)
}
PatternMatch also contains a range for the match.
Use + to join two patterns together. The first pattern must match first, then the second.
for match in text.matches(with: ("hello"/ | "hi"/) + ", world!"/) {
    print(match.range)
}
The C.pattern(_:) method lets you pre-construct a pattern to match against a collection of type C. You can use this for many purposes, but it's especially handy for subpatterns you use in several places.
Use any(where:), any(of:), any(_:), or any() to test an element against an arbitrary predicate, check it for membership in a collection, check it against a list of values, or simply accept any element, respectively. You can negate any of these with a prefix !.
let d = String.pattern(any(of: "0123456789"))
let phoneNumbers = text.matches(with: d+d+d + "-"/ + d+d+d + "-"/ + d+d+d+d)
    .map { $0.contents }.sorted()
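The idea behind these single-element tests can be sketched with plain closures. The names below (anyOf, negated, satisfies) are hypothetical stand-ins written for this illustration and are not PatternKit API; the sketch only checks a fixed-length candidate element by element, whereas the real patterns take part in full searching.

```swift
/// Hypothetical stand-in for a membership test: builds a Set once, then
/// answers each membership query with a hash lookup.
func anyOf<C: Collection>(_ members: C) -> (C.Element) -> Bool where C.Element: Hashable {
    let set = Set(members)
    return { set.contains($0) }
}

/// Hypothetical stand-in for the prefix ! negation.
func negated<E>(_ test: @escaping (E) -> Bool) -> (E) -> Bool {
    return { !test($0) }
}

/// True when `candidate` has exactly one element per test and every element
/// passes the test in the same position.
func satisfies<E>(_ candidate: [E], _ tests: [(E) -> Bool]) -> Bool {
    return candidate.count == tests.count
        && zip(candidate, tests).allSatisfy { pair in pair.1(pair.0) }
}

let digit: (Character) -> Bool = anyOf("0123456789")
let dash: (Character) -> Bool = { $0 == "-" }
// Same shape as the phone number pattern above: ddd-ddd-dddd.
let phoneShape = [digit, digit, digit, dash, digit, digit, digit, dash,
                  digit, digit, digit, digit]
```

With these definitions, satisfies(Array("555-867-5309"), phoneShape) is true, and replacing any digit with a letter makes it false.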
Match a subpattern 0 times or more with *, 1 time or more with +, 0 or 1 times with %, or any other number of times with the repeating(_: Int) or repeating(_: RangeExpression) methods. Repetitions are greedy and support all the backtracking behavior you would expect from a regular expression.
let animaniacsCatcalls =
    text.matches(with: "Hell"/ + ("o"/)+ + ", Nurse!"/)
        .map { $0.contents }
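For readers who want to see what "leftmost preferred" and "greedy with backtracking" mean operationally, here is a toy model of alternation, concatenation, and repetition. It is a sketch under simplifying assumptions, not how PatternKit is implemented: the Matcher type and the literal, either, then, and repeated functions are invented for this illustration, and it works on arrays of Character only.

```swift
// Toy model: a matcher maps an input and a start offset to every offset at
// which a match could end, listed most preferred first.
typealias Matcher = (_ input: [Character], _ start: Int) -> [Int]

func literal(_ text: String) -> Matcher {
    let expected = Array(text)
    return { input, start in
        let end = start + expected.count
        guard end <= input.count, Array(input[start..<end]) == expected else { return [] }
        return [end]
    }
}

// Alternation: candidates from the left pattern are listed (and tried) first.
func either(_ left: @escaping Matcher, _ right: @escaping Matcher) -> Matcher {
    return { input, start in left(input, start) + right(input, start) }
}

// Concatenation: the second pattern is tried from each place the first could
// end, in preference order. Trying a later candidate is the backtracking step.
func then(_ first: @escaping Matcher, _ second: @escaping Matcher) -> Matcher {
    return { input, start in
        first(input, start).flatMap { middle in second(input, middle) }
    }
}

// Greedy repetition: longer runs are listed before shorter ones.
func repeated(_ pattern: @escaping Matcher, atLeast minimum: Int) -> Matcher {
    func ends(_ input: [Character], _ start: Int, _ count: Int) -> [Int] {
        var result: [Int] = []
        // Ignore empty iterations so the recursion always makes progress.
        for middle in pattern(input, start) where middle > start {
            result += ends(input, middle, count + 1)
        }
        if count >= minimum { result.append(start) }
        return result
    }
    return { input, start in ends(input, start, 0) }
}
```

As an example, then(repeated(literal("a"), atLeast: 1), literal("ab")) applied to "aaab" at offset 0 first lets the repetition consume all three a's, finds that "ab" cannot follow, and falls back to two a's, so the match ends at offset 4.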
Patterns can be matched in any Collection of Hashable elements, not just strings.
PatternKit offers a number of convenience methods on Collection, including firstMatch(with:), ranges(with:), range(with:), index(with:), and contains(_:).
if numbers.contains([1, 2, 3, 4, 5]/) {
    print("That's the stupidest combination I've ever heard in my life!")
}
You can create your own pattern types by conforming to the PatternProtocol protocol. See its documentation for details.
When implementing patterns, you may find the trace(…) function useful. It wraps a subpattern so that it prints information about each match it makes.
This library is not production-ready. It is a proof of concept and its interface is likely to change radically in the future. Nor is there any guarantee it will ever be production-ready—that depends on whether people think it might be useful.
- Capturing the matches of a subpattern
- Anchored matches
- Lazy and possessive repetition
- I'd love to get rid of the / operator and support treating any Collection as a pattern matching its elements, but even with conditional conformance, there's no space for the Target associated type.
- Case-insensitive, diacritical-insensitive, and locale-aware matching: passing an areEquivalent function at the pattern's top would do it, but that would break Hashable-based features like any(…) and the internal SearchSkipper type.
- Do we need the PatternMatch type, or can we just return a SubSequence? This probably depends on how capturing works.
- How many matching methods do we want, and with what sort of naming convention?
- More thorough tests
- Complete documentation
- Examine performance characteristics
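
The tension noted in the list above between an areEquivalent function and Hashable-based features can be seen with nothing more than a standard Set. This is an illustration only: caseInsensitivelyEqual is a hypothetical stand-in for such a function, and nothing here is PatternKit code.

```swift
// Hypothetical stand-in for a caller-supplied areEquivalent function.
func caseInsensitivelyEqual(_ a: Character, _ b: Character) -> Bool {
    return a.lowercased() == b.lowercased()
}

let vowels: Set<Character> = ["a", "e", "i", "o", "u"]

// A Set lookup relies on the element's own hashing and ==, so the custom
// equivalence is never consulted and "A" is not found.
let foundByHash = vowels.contains("A")

// Honoring the custom equivalence means comparing against each member in
// turn, which gives up the constant-time lookup.
let foundByScan = vowels.contains(where: { caseInsensitivelyEqual($0, "A") })
```

Here foundByHash is false while foundByScan is true. Any feature built on hashing would either have to normalize elements before hashing them or fall back to a linear scan like the second lookup.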
Brent Royal-Gordon, Architechies.
© 2017 Architechies. All Rights Reserved.
Distributed under the terms of the MIT License. See LICENSE file for details.