nikic / iter
Iteration primitives using generators
License: Other
search() only finds the item you're searching for in the values of the iterable, e.g. search('foo', it) can only find 'foo', or nothing. If we want to know the key of the entry foo, there is currently no way to do this. searchKey('foo', it) would return the key pointing to the found 'foo'.
@nikic Would you accept this feature?
@nikic Is there anything you need help with to get a 1.6 release? Just wondering when we see a 1.6 release with the latest code that was merged in.
Thanks for the awesome library btw
The current release does not contain the search function - it would be nice if a new release could be tagged that includes this.
Similar to iter\flatten, except it flattens by only one level. This is very useful wherever your generators yield arrays and you want each array to be the unit of data to work with.
Here's an implementation:
function concat($iterable) {
    return iter\reduce(
        function ($chainedIterables, $nextIterable) {
            return iter\chain($chainedIterables, $nextIterable);
        },
        $iterable,
        []
    );
}
This should be equivalent to a foldl concat implementation in Haskell.
Since PHP 7.1, which is required in composer.json, the iterable type hint is supported.
Would you:
Or, if not 1:
iterable as an option in the PHPDoc (since array|Traversable does not include iterable)?
It's likely that fn will become a reserved keyword due to https://wiki.php.net/rfc/arrow_functions_v2, so iter\fn should be renamed to iter\func for version 2.0.
One super common use case is finding the index at which you need to insert something into an array.
The built-in array_search() does this if you are looking for an element by exact value or identity. What is dearly missing is a function that returns the key of the first element that matches a user-defined predicate. While it is trivial to write a simple loop, the frequency with which this is needed (particularly when writing complex refactorings using nikic/php-parser) adds up to some cognitive load. It would be great to have a function that DoesJustThat™, and as part of a widely used library too, rather than as a copy-paste in every project's myfuncs.php.
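A minimal sketch of the requested helper, for illustration only. The name findKey is hypothetical and is not taken from iter's API; returning null for "not found" is also an assumption, and it is ambiguous for generators that can yield null keys.

```php
<?php
/**
 * Hypothetical helper: returns the key of the first element whose value
 * satisfies $predicate, or null if no element matches.
 */
function findKey(callable $predicate, iterable $iterable)
{
    foreach ($iterable as $key => $value) {
        if ($predicate($value)) {
            return $key;
        }
    }
    return null;
}

// Find the position of the first statement that is not a "use" statement.
$stmts = ['use', 'use', 'class', 'function'];
$insertAt = findKey(fn ($stmt) => $stmt !== 'use', $stmts);
```

The loop stops at the first match, so it also works on lazy or infinite iterables as long as a match exists.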
There have been some additions since v1.0.0 (e.g. chunk()). It would be great to have a tag for those features.
Would it be nice to have some kind of fluent interface?
That would allow for really readable code like:
toIter([1, 2, 3, 4, 5, 6])
    ->filter(function($v) { return $v % 2 === 0; })
    ->slice(0, 1)
I think it can be implemented relatively simply:
class Test implements \IteratorAggregate {
    private $iterator;

    public function __construct(iterable $iterator)
    {
        $this->iterator = \iter\toIter($iterator);
    }

    public function getIterator()
    {
        return $this->iterator;
    }

    public function filter(\Closure $predicate) {
        return new self(\iter\filter($predicate, $this));
    }
}
What do you think?
I could implement this, and if we alter toIter to return the new type, it wouldn't even break backwards compatibility since people are expecting a generator.
Is there any reason for not passing both key and value to the mapKeys and map functions?
I find myself often wanting to use both even when just changing one.
Currently one option is to use enumerate, but it makes the code a lot longer than it needs to be.
Hey there.
Flatten and flatMap reuse already yielded keys, which is perfectly fine as a traversable but swallows data when converting to array.
Demo code 1:
print_r(iterator_to_array(\iter\flatten([0, 1, [2, 3], 4, 5])));
Result 1:
Array
(
    [0] => 2
    [1] => 3
    [3] => 4
    [4] => 5
)
Demo code 2:
foreach (\iter\flatten([0, 1, [2, 3], 4, 5]) as $key => $value) {
    print_r([$key => $value]);
}
Result 2:
Array
(
    [0] => 0
)
Array
(
    [1] => 1
)
Array
(
    [0] => 2
)
Array
(
    [1] => 3
)
Array
(
    [3] => 4
)
Array
(
    [4] => 5
)
Seems like
That seems to be true for both yield $key => $value and yield from $iterable.
I'm not sure if that's to be considered a bug since "keep keys intact" could be a feature as well.
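For illustration, the overwrite can be reproduced without the library. The sketch below uses a hypothetical stand-in generator, duplicateKeys, that yields the same key/value pairs as the flattened result above, and shows that passing false as the second argument of iterator_to_array discards the keys so no values are lost.

```php
<?php
// Stand-in generator (hypothetical) that repeats keys the same way the
// flattened iterable in the demo does.
function duplicateKeys(): \Generator
{
    yield 0 => 0;
    yield 1 => 1;
    yield 0 => 2;
    yield 1 => 3;
    yield 3 => 4;
    yield 4 => 5;
}

// Keys preserved (the default): later values overwrite earlier ones.
$keyed = iterator_to_array(duplicateKeys());

// Keys discarded: every value survives and is reindexed from 0.
$listed = iterator_to_array(duplicateKeys(), false);
```

Here $keyed holds four entries, matching Result 1, while $listed holds all six values.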
Regards,
Stephan.
Quite often when I'm in the middle of a pipeline of iterators, I would like to examine the current values coming through the pipeline without modifying them.
For example, I might want to log an intermediate result of the pipeline. One could abuse \iter\map to take items from the iterator, perform side effects, and return the items without modification; however, this is somewhat unwieldy:
$_ = \iter\map(doSomething(...), $queue);
$_ = \iter\map(
    function ($item) use ($logger) {
        $logger->info("did something with $item");
        return $item;
    },
    $_
);
$_ = \iter\map(doSomethingWithPreviousValues(...), $_);
As such, I'd like to propose a new function \iter\tap() which takes a callback and an iterable.
For each item in the iterable, it would call the callback with the item but return the item as passed to tap.
$_ = \iter\map(doSomething(...), $queue);
$_ = \iter\tap(fn ($item): void => $logger->info("did something with $item"), $_);
$_ = \iter\map(doSomethingWithPreviousValues(...), $_);
This would be similar in spirit to RxJS's tap function or the Java Stream.peek method.
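A minimal sketch of how such a function could look, assuming the same argument order as map (callback first, iterable second). It is defined here as a plain global function named tap purely for illustration; whether and how the library implements it is not confirmed by this page.

```php
<?php
/**
 * Sketch of the proposed tap(): calls $function with each value for its
 * side effects, then yields the original key and value unchanged.
 */
function tap(callable $function, iterable $iterable): \Iterator
{
    foreach ($iterable as $key => $value) {
        $function($value);
        yield $key => $value;
    }
}

// Record every value that flows through the pipeline without altering it.
$seen = [];
$result = iterator_to_array(
    tap(function ($value) use (&$seen): void {
        $seen[] = $value;
    }, ['a' => 1, 'b' => 2])
);
```

Because the function is a generator, the callback only runs as items are pulled through, which keeps the pipeline lazy.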
See #95
Hi can we get a new release for the new additions of flatMap and flatten
@nikic could you perhaps tag a (stable) release?
Hi,
I think the slice function is slightly broken right now.
The current implementation of slice only breaks out of the iteration after fetching an item from the iterator that lies beyond the maximum index. That extra element is consumed and lost.
Consider the example below:
https://3v4l.org/6B9QS
test 1 is the original function
test 2 is the function changed to break immediately after yielding the last element
When you have a function like this:
$i = function () {
    for ($idx = 0; $idx < 1000; $idx++) {
        echo "preprocess {$idx} \n";
        yield $idx;
    }
};
and call slice($i(), 0, 3), do you expect "preprocess 3" to be executed or not?
I made a PR to illustrate a change that fixes the described behaviour: #76
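The behaviour being asked for can be sketched as follows. This is not the code from the linked PR; sliceStopEarly is a hypothetical name, and the sketch only illustrates checking the bound right after yielding, so the source is never advanced past the last requested element.

```php
<?php
/**
 * Sketch: yields $length items starting at offset $start and stops
 * immediately after the last one, without pulling one more item.
 */
function sliceStopEarly(iterable $iterable, int $start, int $length): \Iterator
{
    if ($length <= 0) {
        return;
    }
    $i = 0;
    foreach ($iterable as $key => $value) {
        if ($i++ < $start) {
            continue;
        }
        yield $key => $value;
        if ($i >= $start + $length) {
            break; // checked before the source is resumed again
        }
    }
}

// Record which indices the source actually had to produce.
$log = [];
$source = (function () use (&$log) {
    for ($idx = 0; $idx < 1000; $idx++) {
        $log[] = $idx;
        yield $idx;
    }
})();

$out = iterator_to_array(sliceStopEarly($source, 0, 3), false);
```

With this ordering the source produces only indices 0, 1 and 2; the step for index 3 never runs.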
Regards
I've come across a rare case where the behavior of \iter\isEmpty() can indirectly cause unexpected consumption from iterators in normal use cases.
Specifically, in the case where isEmpty is provided an object of \IteratorAggregate, the object's getIterator() method is called. Any further uses of this object may call getIterator() a second time, so if this method is not a pure function then the first item(s?) we might expect from the iterator could be lost to the void.
I've put together what I believe is a minimal reproduction case here:
https://gist.github.com/athrawes/e451484fdb4d0646f9aa03c9e47b566a
Basically, if the item passed to isEmpty looks like this:
class MyClass implements \IteratorAggregate {
    public function getIterator(): \Traversable {
        // some logic which consumes some resource
        // or otherwise mutates global mutable state
        global $someGlobalState;
        while ($item = array_pop($someGlobalState)) { yield $item; }
    }
}
then the following does not behave as expected:
$iterable = new MyClass();
\iter\isEmpty($iterable); // <- initially appears to behave as expected, items have not technically been consumed yet
foreach ($iterable as $item) { ... } // <- missing item(s?) from the front as we got a 'new' iterable with said items missing
Then interacting with that item in the future with any foreach constructs or any methods in this library will be missing some item(s?) from the front of the iterable.
I'm not sure if this is really a bug in this library per se, but it is rather unexpected, especially as the docblock on isEmpty claims to not consume items from iterators. Would this simply be a matter of warning users about this edge case, or is there some way to handle this here?
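One possible caller-side workaround, sketched under the assumption that getIterator() is a generator method as in the example: call getIterator() once, then run both the emptiness check and the loop against that single iterator. PoppingSource is a hypothetical stand-in for the class described above.

```php
<?php
// Hypothetical aggregate whose getIterator() consumes internal state.
final class PoppingSource implements \IteratorAggregate
{
    private array $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    public function getIterator(): \Traversable
    {
        while (($item = array_pop($this->items)) !== null) {
            yield $item;
        }
    }
}

$source = new PoppingSource([1, 2, 3]);

// Resolve the iterator exactly once and reuse it everywhere.
$iterator = $source->getIterator();
$isEmpty = !$iterator->valid(); // runs the generator up to its first yield

$items = [];
foreach ($iterator as $item) { // same generator, so nothing is skipped
    $items[] = $item;
}
```

The check still advances the generator to its first yield, but since the loop continues on the same generator, that first item is not lost.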
The take and slice variants don't consume the generator. I need the equivalent of array_splice, or head in Haskell: take the head off the iterable and consume it, so that the next time I use the iterable, it no longer has the head. How can this be done?
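One way to do this without the library is to drive the iterator by hand. The helper below, takeHead, is a hypothetical name used only for this sketch. Note that a generator that has already been advanced cannot be passed to foreach again, because foreach rewinds and Generator::rewind() throws once iteration has begun, so the remainder is also read manually here.

```php
<?php
/**
 * Sketch: removes up to $n values from the front of $iterator and returns
 * them, leaving the iterator positioned on the next unread element.
 */
function takeHead(\Iterator $iterator, int $n): array
{
    $head = [];
    while ($n-- > 0 && $iterator->valid()) {
        $head[] = $iterator->current();
        $iterator->next();
    }
    return $head;
}

$gen = (function () {
    yield from [1, 2, 3, 4, 5];
})();

$head = takeHead($gen, 2);

// Read the remainder manually instead of using foreach (see note above).
$rest = [];
while ($gen->valid()) {
    $rest[] = $gen->current();
    $gen->next();
}
```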
The return type of IteratorAggregate::getIterator is Traversable, but toIter assumes the value returned will be of type Iterator. The return type check on toIter therefore fails when getIterator returns another IteratorAggregate.
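A sketch of how nested aggregates could be unwrapped. resolveIterator and Wrapper are hypothetical names for this illustration and do not reflect how toIter is actually written.

```php
<?php
/**
 * Sketch: turns any iterable into an \Iterator, following
 * IteratorAggregate::getIterator() until a real iterator is reached.
 */
function resolveIterator(iterable $iterable): \Iterator
{
    if (is_array($iterable)) {
        return new \ArrayIterator($iterable);
    }
    while ($iterable instanceof \IteratorAggregate) {
        $iterable = $iterable->getIterator();
    }
    if ($iterable instanceof \Iterator) {
        return $iterable;
    }
    // Remaining case: an internal Traversable that is neither of the above.
    return new \IteratorIterator($iterable);
}

// An aggregate whose getIterator() returns another aggregate.
final class Wrapper implements \IteratorAggregate
{
    public function getIterator(): \Traversable
    {
        return new \ArrayObject([1, 2, 3]); // ArrayObject is an IteratorAggregate
    }
}

$values = iterator_to_array(resolveIterator(new Wrapper()));
```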
I don't know if it would be relevant or not. Some benchmarks might be required to determine whether or not the original json_encode would fail on big arrays with big objects. If so, using iterators could become useful. Compared to the original, special constants would be needed, such as FORCE_OBJECT (which already exists) and a FORCE_LIST.
Thank you for this great library!
I have one use case that I am trying to accomplish with \map and \fn\index, example: https://gist.github.com/aonic/8290062
I came up with a solution here: https://github.com/aonic/iter/blob/master/src/iter.fn.php#L11
Do you think there is a better, existing way of doing this with the functions you've already built? I guess one obvious way is:
[15] boris> $hit = iter\map(iter\fn\index('hit'), $hits);
[16] boris> $names = iter\map(iter\fn\index('name'), $hit);
[17] boris> iter\toArray($names);
Is there any reason not to use iterator_to_array instead of "manually" iterating in the toArray and toArrayWithKeys implementations?
Possibly related to: https://bugs.php.net/bug.php?id=69599
(I'm assuming the nikic there is you)
I have a generator.
I can dump it using iterator_to_array():
array(1) { [2]=> string(6) "test123" }
When I chain it with iterator_to_array(\iter\chain(['a' => 'b'], $iter)), I get:
Segmentation fault (core dumped)
I can't reproduce this using simple arrays or generators.
I feel it is related to some kind of garbage collection:
The source generator has this code:
public function filter(Authorizable $source, iterable $targets, string $permission)
{
    foreach ($targets as $key => $target) {
        if ($this->isAllowed($source, $target, $permission)) {
            yield $key => $target;
            $result[$key] = $target;
        }
    }
}
Note that $result is unused, but with that line added the code works intermittently.
I'd be open to any help in reproducing this / helping debug this.
I could help with that, but I want to get confirmation first.
Relatively new to the library but enjoying it so far. However I've run into a chunking problem when working with heterogeneous pages of flattened data. Any time a chunk crosses an input data page boundary, items are lost because the integer keys overlap.
Here's a working example:
$data = [['a', 'b'], ['c', 'd', 'e'], ['f', 'g', 'h', 'i']];
$flat = \iter\flatMap(function ($x) { return $x; }, $data);
foreach (\iter\chunk($flat, 3, true) as $chunk) {
    foreach ($chunk as $key => $val) {
        print "$key: $val\n";
    }
}
Output:
0: c
1: b
1: d
2: e
0: f
1: g
2: h
3: i
Notice that a is missing and b and c are out of order. This is because both a and c share the same 0 key within the first chunk.
My workaround is to introduce a preserveKeys argument on the chunk function:
function chunk($iterable, $size, $preserveKeys = false) {
    //snip
    foreach ($iterable as $key => $value) {
        if ($preserveKeys) {
            $chunk[$key] = $value;
        } else {
            $chunk[] = $value;
        }
        //snip
    }
    //snip
}
Is there a better way to handle this?
Currently we have count() to count the number of elements in an iterator.
I'd love a function empty() that tells me whether an iterable is empty or not.
This could be done separately. Alternatively, it might be more flexible to add a limit to count:
Many uses of count compare it with some number anyway, so there is no sense wasting time counting beyond it.
function inf() {
    $i = 0;
    while (true) {
        yield $i++;
    }
}
function thousand() {
    $i = 0;
    while ($i < 1000) {
        yield $i++;
    }
}
// Check if empty. [infinite loop]
if (count(inf()) > 0) {
}
// Check if contains more than 50 elements. [infinite loop]
if (count(inf()) > 50) {
}
// Check if contains more than 50 elements. [1000 iterations, 949 useless]
if (count(thousand()) > 50) {
}
// Check if empty. [1000 iterations, 999 useless]
if (count(thousand()) > 0) {
}
Adding a limit to count would resolve these issues:
function inf() {
    $i = 0;
    while (true) {
        yield $i++;
    }
}
function thousand() {
    $i = 0;
    while ($i < 1000) {
        yield $i++;
    }
}
// Check if empty. [1 iteration]
if (count(inf(), 1) > 0) {
}
// Check if contains more than 50 elements. [51 iterations]
if (count(inf(), 51) > 50) {
}
// Check if contains more than 50 elements. [51 iterations]
if (count(thousand(), 51) > 50) {
}
// Check if empty. [1 iteration]
if (count(thousand(), 1) > 0) {
}
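A minimal sketch of a bounded count, written as a separate function so it does not clash with PHP's built-in count(). The name countUpTo and its exact semantics (stop as soon as $limit items have been seen) are assumptions made for this illustration, not the library's API.

```php
<?php
/**
 * Sketch: counts the elements of $iterable but stops as soon as $limit
 * elements have been seen, so it terminates on infinite iterables.
 */
function countUpTo(iterable $iterable, int $limit): int
{
    $count = 0;
    if ($limit <= 0) {
        return $count;
    }
    foreach ($iterable as $_) {
        if (++$count >= $limit) {
            break;
        }
    }
    return $count;
}

// Infinite source: an unbounded count would never return.
function naturals(): \Generator
{
    $i = 0;
    while (true) {
        yield $i++;
    }
}

$nonEmpty = countUpTo(naturals(), 1) > 0;        // pulls one item
$moreThanFifty = countUpTo(naturals(), 51) > 50; // pulls 51 items
$emptyCount = countUpTo([], 1);
```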
Can we get a new release?
If you're interested, I can create a PR for automated releases.
We'd use semantic commit messages and fully automated releases based on them.
We have a use case where we need to get a value by its index.
In that sense, I'd like to suggest introducing a function at/get(iterable $iterable, int $index) : ?int that takes an iterable and returns the item at the specified index.
I'll be glad to provide a PR if welcome.
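For illustration, such a lookup could be sketched as below. The name nth is hypothetical, and the sketch makes two assumptions: the index counts positions (keys are ignored), and null is returned when the iterable is too short. It leaves the return type undeclared because the items need not be integers.

```php
<?php
/**
 * Sketch: returns the value at zero-based position $index,
 * or null if the iterable has fewer than $index + 1 elements.
 */
function nth(iterable $iterable, int $index)
{
    if ($index < 0) {
        throw new \InvalidArgumentException('Index must be non-negative');
    }
    $position = 0;
    foreach ($iterable as $value) {
        if ($position++ === $index) {
            return $value; // stop early; the rest is never iterated
        }
    }
    return null;
}

$second = nth(['a' => 'x', 'b' => 'y', 'c' => 'z'], 1);
$missing = nth([], 0);
```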
@nikic what do you think about adding a unique function that returns an iterator of unique values?
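A minimal sketch of what this could look like. uniqueValues is a hypothetical name, and the choice of strict comparison is an assumption. The linear in_array lookup keeps the example short but makes it quadratic in the number of distinct values.

```php
<?php
/**
 * Sketch: lazily yields each value the first time it is seen,
 * using strict (===) comparison, and keeps the original key.
 */
function uniqueValues(iterable $iterable): \Iterator
{
    $seen = [];
    foreach ($iterable as $key => $value) {
        if (!in_array($value, $seen, true)) {
            $seen[] = $value;
            yield $key => $value;
        }
    }
}

// 1 and '1' are different under strict comparison.
$unique = iterator_to_array(uniqueValues([1, '1', 2, 1, 2, 3]), false);
```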
I think it would be better if developers were able to get the key.
Example:
use iter as f;
f\reduce(function ($acc, $value, $key) {
    $acc .= sprintf("%s: %s\n", $key, $value);
    return $acc;
}, $response->headers->all(), '');
Or is there another way to do this?
I'd love to have some explode-like iterable constructors.
Would they be a good fit for this library?
These are some example implementations. Note there's plenty of room for optimization.
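// Assumed to live in a namespace (e.g. iter), since explode() is a PHP built-in.
function explode(string $delimiter, string $string): iterable {
    $buffer = '';
    $delimiterLength = strlen($delimiter);
    foreach (str_split($string) as $char) {
        $buffer .= $char;
        if (substr($buffer, -$delimiterLength) === $delimiter) {
            // Yield the piece without its trailing delimiter.
            yield substr($buffer, 0, -$delimiterLength);
            $buffer = '';
        }
    }
    yield $buffer;
}
// $stream is a stream resource; "resource" cannot be used as a type declaration.
function explodeStream(string $delimiter, $stream): iterable {
    return explodeCallback($delimiter, function () use ($stream) {
        $char = fread($stream, 1);
        // fread() returns '' at end of stream; map that to null to stop the loop.
        return ($char === false || $char === '') ? null : $char;
    });
}
function explodeCallback(string $delimiter, \Closure $closure): iterable {
    $buffer = '';
    $delimiterLength = strlen($delimiter);
    while (is_string($char = $closure())) {
        $buffer .= $char;
        if (substr($buffer, -$delimiterLength) === $delimiter) {
            yield substr($buffer, 0, -$delimiterLength);
            $buffer = '';
        }
    }
    yield $buffer;
}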
I was trying to use \iter\map like this:
\iter\map([$this, 'aFunction'], $iterable)
But map does not accept this kind of callable. Reading the source shows this:
function map(callable $function, iterable $iterable): \Iterator {
    foreach ($iterable as $key => $value) {
        yield $key => $function($value);
    }
}
Why $function($value) and not call_user_func($function, $value)? I don't see why, shed some light here please.
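For reference, PHP can invoke an array callable directly with the $function($value) syntax, so that form by itself should not reject [$this, 'aFunction']. The sketch below uses a local stand-in, mapSketch, with the same body as the map quoted above, and a hypothetical Doubler class. One possible cause of the reported failure, stated here only as a guess, is method visibility: a private or protected method is not callable from outside its class, in which case wrapping it with Closure::fromCallable inside the class is one option.

```php
<?php
// Local stand-in with the same body as the map() quoted above.
function mapSketch(callable $function, iterable $iterable): \Iterator
{
    foreach ($iterable as $key => $value) {
        yield $key => $function($value);
    }
}

// Hypothetical class passing one of its own public methods as a callable.
final class Doubler
{
    public function double(int $x): int
    {
        return $x * 2;
    }

    public function run(iterable $values): array
    {
        // Array callable: [object, method name].
        return iterator_to_array(mapSketch([$this, 'double'], $values));
    }
}

$result = (new Doubler())->run([1, 2, 3]);
```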
I'd like to release version 2.0 with PHP version requirement bumped to 7.1 and added type hints.
Are there any other (breaking) changes that should be done now?
The current implementation works with generic iterators, which gets in the way when working with generators as coroutines. A simple function such as map does not pass values sent to it through to the underlying generator.
Current Implementation
function map(callable $function, iterable $iterable): \Iterator {
    foreach ($iterable as $key => $value) {
        yield $key => $function($value);
    }
}
Usage Example
function source() {
    echo yield 1;
}
$iterator = map(operator('*', 2), source());
while ($iterator->valid()) {
    $iterator->send($iterator->current());
}
Actual
Displays the number 1
Expected
Displays the number 2
Proposed
This problem affects almost all of the library's functions, not only map.
There is no support for the coroutine ->send method or for the return statement.
It would be desirable to change the behavior to approximately the following:
function toGenerator(iterable $iterable): \Generator {
    yield from $iterable;
    if ($iterable instanceof \Generator) {
        return $iterable->getReturn();
    }
}
function map(callable $function, iterable $iterable): \Iterator {
    $generator = toGenerator($iterable);
    while ($generator->valid()) {
        [$key, $value] = [$generator->key(), $generator->current()];
        $generator->send(yield $key => $function($value));
    }
    return $generator->getReturn();
}
function reverseMap(callable $function, iterable $iterable): \Iterator {
    $generator = toGenerator($iterable);
    while ($generator->valid()) {
        [$key, $value] = [$generator->key(), $generator->current()];
        $generator->send($function(yield $key => $value));
    }
    return $generator->getReturn();
}
I am wondering if this is the best place for some functions which mirror functions provided by default in PHP that operate on arrays, for example, implode.
The primary reasons are:
iter
toArray somewhat defeats the purpose of using generators
Would you be interested in adding functions like this to iter?
Here are some basic examples (not suggesting these are the best implementations):
function change_key_case($iterable, $case = CASE_LOWER)
{
    if ($case === CASE_LOWER) {
        foreach ($iterable as $key => $value) {
            yield strtolower($key) => $value;
        }
    } elseif ($case === CASE_UPPER) {
        foreach ($iterable as $key => $value) {
            yield strtoupper($key) => $value;
        }
    } else {
        throw new InvalidArgumentException('$case must be either CASE_LOWER or CASE_UPPER');
    }
}
function chunk($iterable, $size, $preserve_keys = false)
{
    $chunk = [];
    foreach ($iterable as $key => $value) {
        if ($preserve_keys) {
            $chunk[$key] = $value;
        } else {
            $chunk[] = $value;
        }
        if (\count($chunk) === $size) {
            yield $chunk;
            $chunk = [];
        }
    }
    // Yield the final, partially filled chunk (as array_chunk does).
    if ($chunk) {
        yield $chunk;
    }
}
function count_values($iterable)
{
    $values = [];
    foreach ($iterable as $value) {
        if (empty($values[$value])) {
            $values[$value] = 0;
        }
        $values[$value]++;
    }
    foreach ($values as $value => $count) {
        yield $value => $count;
    }
}
function flip($iterable)
{
    foreach ($iterable as $key => $value) {
        yield $value => $key;
    }
}
function implode($glue, $iterable)
{
    return substr(iter\reduce(function ($acc, $value) use ($glue) {
        return $acc . $value . $glue;
    }, $iterable, ''), 0, -strlen($glue));
}