icamys / php-sitemap-generator
A simple PHP sitemap generator.
License: MIT License
Hello,
I cloned this project and installed the dependencies with the composer install command. But when I try to run the ./vendor/bin/phpunit command, it returns the error stack below. What does it mean?
.................................PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 54589360 bytes) in G:\work\php\php-sitemap-generator\src\SitemapGenerator.php on line 460
PHP Stack trace:
PHP 1. {main}() G:\work\php\php-sitemap-generator\vendor\phpunit\phpunit\phpunit:0
PHP 2. PHPUnit\TextUI\Command::main() G:\work\php\php-sitemap-generator\vendor\phpunit\phpunit\phpunit:61
PHP 3. PHPUnit\TextUI\Command->run() G:\work\php\php-sitemap-generator\vendor\phpunit\phpunit\src\TextUI\Command.php:159
PHP 4. PHPUnit\TextUI\TestRunner->doRun() G:\work\php\php-sitemap-generator\vendor\phpunit\phpunit\src\TextUI\Command.php:200
PHP 5. PHPUnit\Framework\TestSuite->run() G:\work\php\php-sitemap-generator\vendor\phpunit\phpunit\src\TextUI\TestRunner.php:621
PHP 6. PHPUnit\Framework\TestSuite->run() G:\work\php\php-sitemap-generator\vendor\phpunit\phpunit\src\Framework\TestSuite.php:597
PHP 7. Icamys\SitemapGenerator\SitemapGeneratorTest->run() G:\work\php\php-sitemap-generator\vendor\phpunit\phpunit\src\Framework\TestSuite.php:597
PHP 8. PHPUnit\Framework\TestResult->run() G:\work\php\php-sitemap-generator\vendor\phpunit\phpunit\src\Framework\TestCase.php:756
PHP 9. Icamys\SitemapGenerator\SitemapGeneratorTest->runBare() G:\work\php\php-sitemap-generator\vendor\phpunit\phpunit\src\Framework\TestResult.php:691
PHP 10. Icamys\SitemapGenerator\SitemapGeneratorTest->runTest() G:\work\php\php-sitemap-generator\vendor\phpunit\phpunit\src\Framework\TestCase.php:1028
PHP 11. Icamys\SitemapGenerator\SitemapGeneratorTest->testCreateTooLargeSitemap() G:\work\php\php-sitemap-generator\vendor\phpunit\phpunit\src\Framework\TestCase.php:1408
PHP 12. Icamys\SitemapGenerator\SitemapGenerator->createSitemap() G:\work\php\php-sitemap-generator\test\SitemapGeneratorTest.php:438
PHP 13. SimpleXMLElement->asXML() G:\work\php\php-sitemap-generator\src\SitemapGenerator.php:460
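The 134217728 bytes in the message above equal 128 MiB, which is PHP's default memory_limit. The trace suggests testCreateTooLargeSitemap needs more than that. A minimal sketch, assuming it is acceptable to raise the limit for the test process (the 512M value is an arbitrary illustration, not a figure from the project):

```php
<?php
// 134217728 bytes is 128 MiB, PHP's default memory_limit.
$limitInMiB = 134217728 / (1024 * 1024);

// Raise the limit for the current process only.
// '512M' is an illustrative value, not a project recommendation.
ini_set('memory_limit', '512M');

echo $limitInMiB . ' MiB was exhausted; limit is now ' . ini_get('memory_limit') . PHP_EOL;
```

Placing the ini_set call in a test bootstrap file, or setting memory_limit in php.ini, would have the same effect.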
For the lastmod element to be useful, first it needs to be in a supported date format (which is documented on sitemaps.org); Search Console will tell you if it's not once you submit your sitemap. Second, it needs to consistently match reality: if your page changed 7 years ago, but you're telling us in the lastmod element that it changed yesterday, eventually we're not going to believe you anymore when it comes to the last modified date of your pages.
https://developers.google.com/search/docs/crawling-indexing/sitemaps/overview
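As an illustration of a lastmod value in the W3C Datetime format that sitemaps.org documents, here is a small sketch using PHP's built-in DateTime class. The date itself is only a sample value:

```php
<?php
// Sample modification time; in practice this would come from the page's real history.
$lastmod = new DateTime('2020-12-29 08:46:55', new DateTimeZone('UTC'));

// Full timestamp with a timezone offset.
$full = $lastmod->format(DateTime::ATOM);

// A date-only value is also accepted by the sitemap protocol.
$dateOnly = $lastmod->format('Y-m-d');

echo $full . PHP_EOL . $dateOnly . PHP_EOL;
```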
I actually passed null and that removed lastmod from the generated sitemap. It might be good to mention that somewhere.
$generator->addURL('/blauwalg-radar', null, 'daily', $priority+0.5);
<url>
<loc>https://zwemindex.nl/blauwalg-radar</loc>
<changefreq>daily</changefreq>
<priority>1.0</priority>
</url>
Is there a way of submitting an existing sitemap (just passing in a filename or the content as a string)?
If so, please provide an example.
I have tried this with [CodeIgniter 4](https://codeigniter.com/), but it is not working: there is no error and no sitemap files appear in the CodeIgniter 4 public folder.
Google supports this syntax to submit alternative links for other languages: https://webmasters.googleblog.com/2012/05/multilingual-and-multinational-site.html
Is it possible to generate a sitemap containing alternative links with this library?
Hi,
I saw that ask.com was already removed. You can also remove the Bing ping, because it is no longer available either.
You can test it: https://www.bing.com/ping (HTTP 410). There is also information about this here: https://blogs.bing.com/webmaster/may-2022/Spring-cleaning-Removed-Bing-anonymous-sitemap-submission
Thx,
Tobi
I've installed the crawler. It generates the files, creates the compressed file and writes my base paths, so that part is all good,
but it does not crawl the directories.
Hello, I can't find an option to change the names of individual sitemaps and then have all of them joined in the main sitemap.
I need a structure like this:
sitemap-post-1.xml
sitemap-post-2.xml
sitemap-tag-1.xml
sitemap-tag-2.xml
sitemap-category-1.xml
sitemap-category-2.xml
sitemap-index.xml
I tried changing the name with the setSitemapFilename function, but it only takes the final name and uses it for all of them.
My code is:
$generator = new SitemapGenerator($siteUrl, $outputDir);
$alternates = [
['hreflang' => 'de', 'href' => "http://www.example.com/de"],
['hreflang' => 'fr', 'href' => "http://www.example.com/fr"],
];
$datetimeStr = '2020-12-29T08:46:55+00:00';
$lastmod = new DateTime('2020-12-29T08:46:55+00:00');
$generator->setMaxUrlsPerSitemap(8);
$generator->setSitemapFilename('post-category.xml'); // Sitemap name1
for ($i = 0; $i < 20; $i++) {
$generator->addURL("/path/to/page-$i/", $lastmod, 'always', 0.5, $alternates);
}
$generator->flush();
$generator->setMaxUrlsPerSitemap(5);
$generator->setSitemapFilename('post-tag.xml'); // Sitemap name2
for ($i = 0; $i < 20; $i++) {
$generator->addURL("/path/to/page2-$i/", $lastmod, 'always', 0.5, $alternates);
}
$generator->flush();
$generator->finalize(); // Generate index with all sitemaps
How can I get the result I need?
Hello,
I am using your sitemap generator for a project I am working on and I have an issue with alternate URLs.
When I create a sitemap with alternate URLs, the browser shows only the first one, and the alternate is visible only in the code/inspector.
Is there something I can do about that?
A screenshot of the browser is attached.
Thank you,
Alena Martinkova
I ran the generator twice on different dates and got an invalid sitemap index. Then I found that it appends the XML content:
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<sitemap>
<loc>https://xxxx.com/sitemap1.xml.gz</loc>
<lastmod>2022-01-12T22:57:28-05:00</lastmod>
</sitemap>
</sitemapindex>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<sitemap>
<loc>https://xxxx.com/sitemap1.xml.gz</loc>
<lastmod>2022-01-06T12:37:21-05:00</lastmod>
</sitemap>
</sitemapindex>
It should overwrite the content in sitemap-index.xml, sitemap1.xml.gz and so on, right?
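For reference, the difference between overwriting and appending can be sketched with plain PHP file functions. This is not the library's code; the file name and contents are made up for the demonstration:

```php
<?php
// Hypothetical demo file in the system temp directory.
$path = sys_get_temp_dir() . '/sitemap-index-demo.xml';

// First run writes the index.
file_put_contents($path, "<sitemapindex>first run</sitemapindex>\n");

// Second run: without the FILE_APPEND flag the previous content is replaced,
// so the file still holds a single root element.
file_put_contents($path, "<sitemapindex>second run</sitemapindex>\n");

$content = file_get_contents($path);
unlink($path);

echo $content;
```

Until the generator overwrites its output, deleting the old sitemap files before each run is a possible workaround.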
As per the Google documentation: https://support.google.com/webmasters/answer/189077
Problem: Multiple language tags
Expected: <xhtml:link/>
Current:
According to this statement from Google, submitting a sitemap through the ping URL provided by Google will no longer be allowed. Currently they do it through a Google console API. It would be good if the creator of this repository had an API app in Google Console that allowed uploading sitemaps individually or in portions through the API. The Rank Math WordPress plugin does this: it has a script made within the Google Console for those who want to upload URLs that way from within the plugin, for free. Please review this topic.
https://developers.google.com/search/blog/2023/06/sitemaps-lastmod-ping
Hi, FileSystem object is not available for me on PHP 7.4.
Please check it.
Hi,
When using the sitemap generator behind a proxy server, I get a long timeout while generating the sitemap, even though the proxy server is configured in the CMS.
Are there special properties to set?
Thx
As you can see from here, the location is not correct.
https://techindeep.com/sitemap/sitemap-index.xml
It should have been https://www.techindeep.com/sitemap/sitemap1.xml.gz, not https://www.techindeep.com/sitemap1.xml.gz
Hi,
I've added a PR to allow the selected stylesheet to be applied to the sitemap index in addition to the sitemap itself (it's just a 4-line cut/paste). Unless there's a reason not to apply the stylesheet to indexes, this would seem to make sense, as it makes chained sheets more consistent.
#66
Thank you for creating a very useful generator.
I'm using the latest release and it appears there are two bugs in the updateRobots() method when the total number of URLs is less than MAX_URLS_PER_SITEMAP (therefore making the sitemap index not required). The following Sitemap records are written to robots.txt in these scenarios...
When MAX_URLS_PER_SITEMAP is set to a number less than the total number of URLs (forcing the sitemap index to be required):
Sitemap: https://www.mydomain.com/sitemap-index.xml
This is the correct behaviour, I believe.
Without GZ compression turned on:
Sitemap: https://www.mydomain.com/sitemap.xml
Sitemap: https://www.mydomain.com/sitemap.xml
The line should only be written once.
With GZ compression turned on:
Sitemap: https://www.mydomain.com/sitemap.xml.gz
Sitemap: https://www.mydomain.com/sitemap.xml.gz.gz
There should be two lines (one for .xml and one for .gz), but they are named incorrectly (.gz is added to the filename twice).
I believe the offending code is on lines 488-490, and 494-496:
if (!isset($this->sitemapIndex)) {
$robotsFileContent .= "\nSitemap: " . $this->getSitemapFileName($this->sitemapFullURL);
}
...because sitemapIndex is never set if the number of URLs is less than MAX_URLS_PER_SITEMAP.
I hope this makes sense, but if not please feel free to ask me for clarification.
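The behaviour the reporter expects can be sketched as a small standalone function. buildRobotsSitemapLines is a hypothetical helper written for this illustration, not a method of the library, and it assumes the reporter's expectation of one line for the .xml file plus one for the .gz file when compression is on:

```php
<?php
/**
 * Hypothetical helper (not part of the library): returns the robots.txt
 * "Sitemap:" lines for a single sitemap file, each URL listed once.
 */
function buildRobotsSitemapLines(string $sitemapUrl, bool $gzipEnabled): array
{
    $urls = [$sitemapUrl];
    if ($gzipEnabled) {
        // Strip an existing .gz suffix first so it is never appended twice.
        $urls[] = preg_replace('/\.gz$/', '', $sitemapUrl) . '.gz';
    }

    $lines = [];
    foreach (array_unique($urls) as $url) {
        $lines[] = 'Sitemap: ' . $url;
    }
    return $lines;
}

print_r(buildRobotsSitemapLines('https://www.mydomain.com/sitemap.xml', true));
```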
Your dependencies do not support PHP 8.0:
Problem 1
- Root composer.json requires astrotomic/php-open-graph ^0.5.0 -> satisfiable by astrotomic/php-open-graph[0.5.0].
- astrotomic/php-open-graph 0.5.0 requires php ^7.1 -> your php version (8.0.0) does not satisfy that requirement.
Problem 2
- icamys/php-sitemap-generator[2.0.0, ..., 2.0.3] require php ^7.2 -> your php version (8.0.0) does not satisfy that requirement.
- Root composer.json requires icamys/php-sitemap-generator ^2 -> satisfiable by icamys/php-sitemap-generator[2.0.0, 2.0.1, 2.0.3].
Hi, I just figured out that I have an error with this library in my project. The main cause is that http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap= is no longer active. It has been replaced with http://www.bing.com/ping?sitemap=, which is already in searchEngines. Could we take the Yahoo one out, then?
Please contact me at [email protected]. I want SEO for my site.
An example of my code:
$yourSiteUrl = 'https://optklin.com.ua';
$outputDir = getcwd();
$generator = new \Icamys\SitemapGenerator\SitemapGenerator($yourSiteUrl, $outputDir);
$generator->toggleGZipFileCreation();
$generator->setMaxURLsPerSitemap(50000);
$generator->setSitemapFileName('/frontend/web/sitemap.xml');
$generator->addURL('/', new \DateTime(), 'always', 0.5, []);
$generator->createSitemap();
$generator->writeSitemap();
It only creates a file like this:
https://optklin.com.ua/ 2020-12-08T10:58:14+00:00 always 0.5
I added
$generator->addURL($loc, $lastModified, 'monthly', 0.9);
but it generates:
<url>
<loc>https://test-web.com</loc>
<lastmod>2020-08-25T08:09:12-03:00</lastmod>
<changefreq>monthly</changefreq>
<priority>0,9</priority>
</url>
and that priority violates the rules for float numbers.
(Note that my PHP is configured with a South American locale and number format.)
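One locale-independent way to render the priority is PHP's number_format with an explicit decimal separator. The sketch below assumes one decimal place is wanted. formatPriority is a hypothetical helper name, not something the library provides:

```php
<?php
/**
 * Hypothetical helper: formats a sitemap priority with a dot as the
 * decimal separator, regardless of the locale set via setlocale().
 */
function formatPriority(float $priority): string
{
    // number_format uses the separators passed in, not the current locale.
    return number_format($priority, 1, '.', '');
}

echo formatPriority(0.9) . PHP_EOL;
```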
Hi, thanks for the library, it works great, but I have a small issue:
My site base URL is "site.com" and every link inside the sitemaps is generated correctly, but the links to the sitemaps in sitemap-index.xml are broken, because my "sitemap base URL" is "site.com/api/sitemap".
$siteUrl = 'https://site.com';
$outputDir = __DIR__ . '/../../var/sitemap';
$generator = new SitemapGenerator($siteUrl, $outputDir);
// Add 50000+ urls
// $generator->addURL('/' . $url['path'], $url['updated_at'], $url['change_freq'], 0.5, $url['alternates']);
// $generator->flush();
// Workaround:
$ref = new \ReflectionObject($generator);
$baseUrlProperty = $ref->getProperty('baseURL');
$baseUrlProperty->setAccessible(true);
$baseUrlProperty->setValue($generator, "{$siteUrl}/api/v1/sitemap"); // Fix is here
$baseUrlProperty->setAccessible(false);
$generator->finalize();
It generates exactly what I need:
In index:
<sitemap>
<loc>https://site.com/api/v1/sitemap/sitemap1.xml</loc>
<lastmod>2021-02-02T16:15:58+00:00</lastmod>
</sitemap>
While in sitemap1.xml
<url>
<loc>http://site.com/blog</loc>
...
So it would be nice to have an option to change the sitemap base URL for cases where the sitemaps are not located in the root of the website.
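The requested option boils down to building the index entries from a separate base URL. A rough sketch of that idea follows. sitemapLocation and its parameters are hypothetical names for illustration, not the library's API:

```php
<?php
/**
 * Hypothetical helper: joins a sitemap base URL and a sitemap file name
 * with exactly one slash between them.
 */
function sitemapLocation(string $sitemapBaseUrl, string $fileName): string
{
    return rtrim($sitemapBaseUrl, '/') . '/' . ltrim($fileName, '/');
}

echo sitemapLocation('https://site.com/api/v1/sitemap/', 'sitemap1.xml') . PHP_EOL;
```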
I found that the license is declared as MIT in composer.json.
Could a license file also be added to the root of the project?
The submitSitemap method does not take into account that the cURL request can return FALSE:
PHP Error[8]: Trying to access array offset on value of type bool in file /vendor/icamys/php-sitemap-generator/src/SitemapGenerator.php at line 634
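A generic way to guard against this is to check the return value of curl_exec for false before reading anything from the response. The sketch below is not the library's submitSitemap code. pingUrl and describeCurlResult are hypothetical names, and the 2xx success rule is an assumption:

```php
<?php
/**
 * Hypothetical helper: turns the raw values from a cURL call into a
 * result array, treating a false body as a failed request.
 */
function describeCurlResult($body, int $httpCode, string $error): array
{
    if ($body === false) {
        // curl_exec returned false: no response to inspect.
        return ['ok' => false, 'message' => 'request failed: ' . $error];
    }
    return ['ok' => $httpCode >= 200 && $httpCode < 300, 'message' => 'HTTP ' . $httpCode];
}

/**
 * Hypothetical wrapper showing where the check sits in a real request.
 */
function pingUrl(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    $httpCode = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $error = curl_error($ch);
    curl_close($ch);

    return describeCurlResult($body, $httpCode, $error);
}
```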
I'm using the exact code from the readme file. Apparently gzip compression happens twice.
I downloaded one of the sitemap files and extracted it once, then changed the extension of the resulting XML file to .gz and extracted it again to see the content.
The compressed file and the XML file are almost the same size.
There are no issues when createGZipFile=false.
if (function_exists("gzwrite")) {
echo "OK";
} else {
die("zlib missing");
}
include "src/SitemapGenerator.php";
$generator = new \Icamys\SitemapGenerator\SitemapGenerator('https://mysite.ir');
$generator->createGZipFile = true;
$generator->maxURLsPerSitemap = 25000;
$generator->sitemapFileName = "sitemap.xml";
$generator->sitemapIndexFileName = "sitemap-index.xml";
$products = R::find( 'products' , 'limit 30000');
foreach($products as $product){
    $generator->addUrl('/'.$product->id . '/' . strtolower(str_replace(" ","-",$product->fa_title)), new DateTime(), 'always', '0.8' );
}
$generator->createSitemap();
$generator->writeSitemap();
Hi there, I'm testing your script.
Does it crawl the whole site or just the specified URLs?
Thanks
Hello, it seems that ask.com has changed their submission link, because at the moment php-sitemap-generator fails with the error
PHP Fatal error: Uncaught RuntimeException: failed to run curl_exec, error: Could not resolve host: submissions.ask.com in vendor\icamys\php-sitemap-generator\src\SitemapGenerator.php:638
and the domain is not found: http://submissions.ask.com/
Is it possible to remove it from the code? Thanks
Having this code:
$config = new Config();
$config->setBaseURL( $host );
$config->setSaveDirectory( static::SITEMAP_DIR );
$generator = new SitemapGenerator($config);
$generator->setMaxUrlsPerSitemap( 50000 );
$generator->setSitemapIndexFileName( 'sitemap.xml' );
$generator->setSitemapFileName( 'sitemap-item-.xml' );
foreach( range( 0, 9999 ) as $i ) {
$generator->addURL( "/url/{$i}" );
}
$generator->flush();
$generator->finalize();
The generator will only create a single file named sitemap-item-.xml.
This is unexpected (at least, for me?), and I'd want it to create a sitemap-item-1.xml and sitemap.xml as sitemap index.
It works as I want if I change MaxUrlsPerSitemap to be lower than the URL count:
// will create three files: sitemap.xml as sitemap index, sitemap-item-1.xml and sitemap-item-2.xml as sitemaps.
$generator->setMaxUrlsPerSitemap( 9999 );
Is there a setting I've missed, or is this a bug?
Hi,
As per Google specification for image sitemaps, it is possible to have multiple images (up to 1000) for a single page.
You may find the example below:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<url>
<loc>http://example.com/sample1.html</loc>
<image:image>
<image:loc>http://example.com/image.jpg</image:loc>
</image:image>
<image:image>
<image:loc>http://example.com/photo.jpg</image:loc>
</image:image>
</url>
<url>
<loc>http://example.com/sample2.html</loc>
<image:image>
<image:loc>http://example.com/picture.jpg</image:loc>
</image:image>
</url>
</urlset>
The current implementation assumes a single 'google_image' key in the extension parameter and does not accept an array of multiple 'google_image' elements, which makes it impossible to add multiple images for a single path.
The proposed solution would be, in addition to accepting a flattened array (preserving backward compatibility), to allow passing an array of multiple image/video elements. Please refer to the example below:
$extensions = [
['google_image' => ['loc' => 'https://www.example.com/thumbs/1.jpg']],
['google_image' => ['loc' => 'https://www.example.com/thumbs/2.jpg']],
['google_image' => ['loc' => 'https://www.example.com/thumbs/3.jpg']]
];
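A backward-compatible way to accept both shapes could be to normalise the input into a list first. This is only a sketch of the proposal. normalizeExtensions is a hypothetical function, not existing library code:

```php
<?php
/**
 * Hypothetical helper: converts either a flat extensions array
 * (['google_image' => [...]]) or a list of such arrays into a list.
 */
function normalizeExtensions(array $extensions): array
{
    if ($extensions === []) {
        return [];
    }

    // Already a list (keys 0..n-1): use it as is.
    if (array_keys($extensions) === range(0, count($extensions) - 1)) {
        return $extensions;
    }

    // Flat shape: wrap each named extension in its own list entry.
    $normalized = [];
    foreach ($extensions as $name => $data) {
        $normalized[] = [$name => $data];
    }
    return $normalized;
}

print_r(normalizeExtensions(['google_image' => ['loc' => 'https://www.example.com/thumbs/1.jpg']]));
```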
Thanks,
Ilya
It would be nice to support video sitemaps (https://support.google.com/webmasters/answer/80471). Happy to take a stab at this.
Would just adding a new parameter to addURL be acceptable?
I am getting errors from Google and Yandex when more than one XML file is generated.
I guess it is related to a tag: when there are multiple XML files, the urlset of one file is closed in the next file, and right after that a new XML structure with content is created:
</urlset>
<?xml version="1.0" encoding="UTF-8"?>
<!--generator-class="Icamys\SitemapGenerator\SitemapGenerator"-->
<!--generator-version="4.3.2"-->
<!--generated-on="2021-08-01T13:52:59+02:00"-->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<URL>
and then the other content follows until the end, without the closing </urlset> tag.
Notice: Trying to access array offset on value of type bool in /webserver/libs/icamys/php-sitemap-generator/src/SitemapGenerator.php on line 647
Fatal error: Uncaught RuntimeException: failed to run curl_exec, error: Could not resolve host: submissions.ask.com in /webserver/libs/icamys/php-sitemap-generator/src/SitemapGenerator.php:638 Stack trace: #0 /webserver/sitemap-php.php(99): Icamys\SitemapGenerator\SitemapGenerator->submitSitemap() #1 {main} thrown in /webserver/libs/icamys/php-sitemap-generator/src/SitemapGenerator.php on line 638
I am trying it without Composer as well.
So I see in your example, you have a call to ->addUrl like this
$generator->addUrl('http://example.com/url/path/', date('c'), 'always', '0.5');
but when I use date('c') as the lastmodified date I keep on getting a
Argument 2 passed to Icamys\SitemapGenerator\SitemapGenerator::addUrl() must be an instance of DateTime, string given
This is because date('c') resolves to a string. It took me quite a while to realize this and I feel like it shouldn't be displayed like that in the example.
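To make the type difference concrete: date('c') returns a string, while new DateTime() returns the object the error message asks for. A short sketch using only built-in PHP functions:

```php
<?php
// date('c') produces an ISO 8601 string, not an object.
$asString = date('c');

// A DateTime instance for "now".
$asObject = new DateTime();

// An existing ISO 8601 string can be converted into a DateTime as well.
$fromString = new DateTime($asString);

var_dump(is_string($asString), $asObject instanceof DateTime, $fromString instanceof DateTime);
```

Either $asObject or $fromString could then be passed as the second argument in place of date('c').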
Currently the generator outputs <link rel="alternate" hreflang="en" href="...
This works for Google Search, but not for Yandex, where I get tons of errors because of it.
Searching for this, I found that the link must be <xhtml:link rel="alternate" hreflang="de" href=
For more details, please see Google's instructions: https://developers.google.com/search/docs/advanced/crawling/localized-versions
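For illustration, the prefixed form asked for above can be produced with PHP's XMLWriter extension. This is a standalone sketch, not the generator's implementation. The URLs are example values:

```php
<?php
// Example alternates; the URLs are placeholders.
$alternates = [
    ['hreflang' => 'de', 'href' => 'http://www.example.com/de'],
    ['hreflang' => 'fr', 'href' => 'http://www.example.com/fr'],
];

$writer = new XMLWriter();
$writer->openMemory();
$writer->startDocument('1.0', 'UTF-8');

$writer->startElement('urlset');
$writer->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');
// Declare the xhtml prefix once on the root element.
$writer->writeAttribute('xmlns:xhtml', 'http://www.w3.org/1999/xhtml');

$writer->startElement('url');
$writer->writeElement('loc', 'http://www.example.com/');
foreach ($alternates as $alternate) {
    // Prefixed element name, so the link is in the XHTML namespace.
    $writer->startElement('xhtml:link');
    $writer->writeAttribute('rel', 'alternate');
    $writer->writeAttribute('hreflang', $alternate['hreflang']);
    $writer->writeAttribute('href', $alternate['href']);
    $writer->endElement();
}
$writer->endElement(); // url

$writer->endElement(); // urlset
$writer->endDocument();

$xml = $writer->outputMemory();
echo $xml;
```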
In PHP 8.0.11 it gives the following error.
Fatal error: Uncaught Error: Class "Icamys\SitemapGenerator\FileSystem" not found in /var/www/vhosts/example.com/httpdocs/class/class.sitemap.generator.php:224 Stack trace: #0 /var/www/vhosts/example.com/httpdocs/cron.sitemap.generator.php(12): Icamys\SitemapGenerator\SitemapGenerator->__construct() #1 {main} thrown in /var/www/vhosts/example.com/httpdocs/class/class.sitemap.generator.php on line 224