php-simple-html-dom-parser's People

Contributors

daigo75, mlocati, mryanb, pherserk, shtse8, sunra, wanderingzombie

php-simple-html-dom-parser's Issues

$html->load_file shows errors if a page doesn't exist.

I have a problem. When I use the $html->load_file method, it shows errors if a page doesn't exist.
The error says: 'Warning: file_get_contents(http://auto.desko.kg/car/24779): failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found in C:\xampp\htdocs\deskoparse\simple_html_dom.php on line 1080'. Is it possible to add a check to the parser so it can find out whether such a page exists? Also, why doesn't that method return 'True' if a page exists?
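
One way to avoid the warning is to fetch the page yourself and hand the markup to the parser only when the fetch succeeds. The sketch below is not part of the library: the helper name fetch_html_or_null is made up for illustration, and it relies only on PHP's file_get_contents returning false on failure (which, with default stream settings, includes an HTTP 404).

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): returns the page body,
// or null when the URL or file cannot be read.
function fetch_html_or_null(string $source): ?string
{
    // The @ operator suppresses the "failed to open stream" warning;
    // failure is detected through the strict false return value instead.
    $contents = @file_get_contents($source);
    return $contents === false ? null : $contents;
}

$body = fetch_html_or_null('/no/such/dir/page-24779.html');
if ($body === null) {
    echo "page not available\n";
}
```

When the result is not null, it can be passed to str_get_html() as in the other reports on this page.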

tag attributes missing

Here is my testing code:

$CC = <<<EOF
<p style="max-width: 100%; min-height: 1em; white-space: pre-wrap; color: rgb(62, 62, 62); text-align: center; font-family: 微软雅黑; font-size: 14px; line-height: 24px; box-sizing: border-box !important; word-wrap: break-word !important; background-color: rgb(255, 255, 255);"><img img_width="500" img_height="398" data-type="jpeg" data-ratio="NaN" data-w="0" width="auto" width="auto" data-src="http://mmbiz.qpic.cn/mmbiz/fZ6yVsBCVhLQdrDUBay4Ps1qhhKGiadibMIdicxOXx74cXsIVxk0Emib1XpZxHUXLuToWEMibPRr0I8noqtuWZfowNg/640?wx_fmt=jpeg"/></p>
EOF;

// load the class as you usually do
//Loader::import('SimpleHtmlDom', 'html');
$DOM    = str_get_html($CC);
if ( $DOM == false )
{
    return false;
}

echo $DOM->innertext;

The output is:

<p style="max-width: 100%; min-height: 1em; white-space: pre-wrap; color: rgb(62, 62, 62); text-align: center; font-family: 微软雅黑; font-size: 14px; line-height: 24px; box-sizing: border-box !important; word-wrap: break-word !important; background-color: rgb(255, 255, 255);"><img img_width="500" img_height="398" data-type="jpeg" data-ratio="NaN" data-w="0" width="auto"></p>

Something is missing: the data-src attribute and the second width attribute are gone.

If I comment out the code fragment between lines 1488 and 1491, I get what I want:

if (isset($node->attr[$name]))
{
    return;
}

It may be a bug!

No removeChild() function

As far as I can determine, Simple HTML DOM does not have a way to actually remove DOM elements from a document. This can be troublesome, especially if you're using mpdf to make a PDF file and there's an <svg> tag in there; mpdf flips out whenever it sees one.

There may be a good reason removeChild() has not been implemented, but as a suggestion for a future update, could such a function be implemented?

Can't find simple things

The parser won't find <p class="body"> in this line:
<script id="forecast-summary-0" type="text/x-jquery-tmpl"> <div id="forecast-summary" class="summary-column"> <h3>Forecast Summary</h3> <div class="forecast-summary" lang="en-GB"> <ul > <li> <h4 class="title">This Evening and Tonight</h4> <p class="body">Fairly cloudy this evening with scattered heavy showers, which gradually ease through the evening. However cloud thickeing overnight to bring periods of occasionally heavy rain before dawn as southeast winds increase strong to near gale.</p> </li> </ul> </div> </div> </script>
(all on one line). I use $body = $body[0]->find('p[body]'); to find it, but it returns no results. Is there something I've missed? Can you help?

Cannot find tags that have additional classes

The find method cannot find tags that have additional classes.

For example, I want to find all tags that have 'services' class:

<div class='services'> or
<div class='services last-item'> or
<div class='services active'>

But if I run:

$html->find('div[class=services]'); 

I will only get one result:

<div class='services'>
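
In this report the attribute selector appears to compare against the whole class attribute, so 'services last-item' does not equal 'services'. The library's manual also lists a class selector (div.services) and a substring selector ([class*=services]) that may fit better; how each released version handles them is not verified here. The check that is actually wanted is a token comparison, sketched below in plain PHP with a made-up helper name.

```php
<?php
// Hypothetical helper: true when $wanted is one of the space-separated
// class names in a class attribute value.
function has_class(string $classAttr, string $wanted): bool
{
    $tokens = preg_split('/\s+/', trim($classAttr), -1, PREG_SPLIT_NO_EMPTY);
    return in_array($wanted, $tokens, true);
}

var_dump(has_class('services', 'services'));           // bool(true)
var_dump(has_class('services last-item', 'services')); // bool(true)
var_dump(has_class('other-services', 'services'));     // bool(false)
```

Note that a substring match would also accept 'other-services', which is why the sketch compares whole tokens.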

Invalid character class

This pattern:

([\w-:*])(?:#([\w-]+)|.([\w-]+))?(?:[@?(!?[\w-:]+)(?:([!^$]?=)["']?(.*?)["']?)?])?([/, ]+)
This treats the - in both character classes as a range operator rather than a literal character, so the regex looks for everything between \w and : inclusive rather than for the three items by themselves. The same issue is repeated near the middle of the regex.

See pr #70
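
A minimal illustration of the fix, assuming the intent is to match word characters, hyphens, and colons: put the hyphen last in the class (or escape it) so PCRE cannot read it as a range. Only a short tag-name pattern is shown here; the full selector pattern would need the same change in each character class.

```php
<?php
// Hyphen placed last, so it is a literal character and not a range operator.
$tagPattern = '/^[\w:-]+$/';

var_dump(preg_match($tagPattern, 'p'));          // int(1)
var_dump(preg_match($tagPattern, 'svg:rect'));   // int(1)
var_dump(preg_match($tagPattern, 'my-element')); // int(1)
var_dump(preg_match($tagPattern, 'bad tag'));    // int(0)
```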

what if i'm not sure the html contains h1, h2 ,h3 or h4

Is there a way to avoid something like this?

$dom = HtmlDomParser::str_get_html($html_str);

if($dom->find('h1', 0))
    return $dom->find('h1', 0)->plaintext;
if($dom->find('h2', 0))
    return $dom->find('h2', 0)->plaintext;
if($dom->find('h3', 0))
    return $dom->find('h3', 0)->plaintext;
if($dom->find('h4', 0))
    return $dom->find('h4', 0)->plaintext;
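
The repeated checks can be folded into a loop over the tag names in priority order. This is a sketch: first_heading_text is a made-up name, and it assumes only what the snippet above already uses, namely that find($tag, 0) returns a node or a falsy value and that nodes expose plaintext.

```php
<?php
// Hypothetical helper: returns the text of the first heading found,
// trying each tag in the given priority order, or null if none exists.
function first_heading_text($dom, array $tags = ['h1', 'h2', 'h3', 'h4']): ?string
{
    foreach ($tags as $tag) {
        $node = $dom->find($tag, 0); // one lookup per tag instead of two
        if ($node) {
            return $node->plaintext;
        }
    }
    return null;
}
```

With the parser from the question, the body of the function would reduce to a single call such as first_heading_text($dom).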

php-simple-html-dom-parser for PHP7.3 +

If you use PHP 7.3 or higher, use my edits. Otherwise, you will get errors caused by the migration to PCRE2 in newer versions of PHP.

For example: Warning: preg_match_all(): Compilation failed: invalid range in character class at offset 4

New file

curl request works only in local environment

Hello, I'm using the latest version of the parser <1.8.1> downloaded from the official sourceforge page. When I use the function file_get_html() to pull a webpage from a remote host, I get a warning that the request has timed out <at line 136>. The warning/error occurs only when the request is made from a remote host/environment; it works perfectly fine when made from my local server.

Edit: That's the whole code on github - here

Additional edit: You can experience the warning/error in the integrated github environment or at my remote server...

Endless __destruct() loop when served via Laravels php artisan serve command

When I use your library together with Laravel and start Laravel's local development server with the command php artisan serve, I run into an issue where the server gets stuck in an endless loop of calls to simple_html_dom_node->__destruct(). After the maximum execution time is exceeded, the Laravel server reports:

Laravel development server started: <http://127.0.0.1:8000>
[Thu Jun 14 09:24:43 2018] PHP Fatal error:  Maximum execution time of 60 seconds exceeded in C:\Users\[REDACTED]\Desktop\Tests\PHP\blog\blog\vendor\sunra\php-simple-html-dom-parser\Src\Sunra\PhpSimple\simplehtmldom_1_5\simple_html_dom.php on line 140
[Thu Jun 14 09:24:43 2018] PHP Stack trace:
[Thu Jun 14 09:24:43 2018] PHP   1. simplehtmldom_1_5\simple_html_dom_node->__destruct() C:\Users\[REDACTED]\Desktop\Tests\PHP\blog\blog\vendor\sunra\php-simple-html-dom-parser\Src\Sunra\PhpSimple\simplehtmldom_1_5\simple_html_dom.php:0

I debugged the issue for a while but could not resolve it in any way other than deleting, renaming, or commenting out the destructors in the mentioned class.

Minimum, Complete and Verifiable example/Steps to reproduce:

  1. Create Laravel project by running composer create-project --prefer-dist laravel/laravel blog
  2. cd blog and update composer.json with required dependency to your library "sunra/php-simple-html-dom-parser": "^1.5"
  3. Run composer update to fetch newly added dependency.
  4. Open file routes/web.php and update its contents to contain following
use Sunra\PhpSimple\HtmlDomParser;
Route::get('/',  function(){
    $input = <<<EOM
    <!-- PUT YOUR NON-TRIVIAL HTML MARKUP HERE -->                    
EOM;
    $parser = new HtmlDomParser();
    $dom = $parser->str_get_html($input);


    return view('welcome');
});

(1) Note that <!-- PUT YOUR NON-TRIVIAL HTML MARKUP HERE --> should really be replaced with non-trivial markup, e.g. the source of google.com's front page.
  5. Start the local development server with php artisan serve and open the address it reports (it defaults to 127.0.0.1:8000).

(2) Note that I was not able to reproduce it on PHP v7.1.14 or PHP v7.1, but PHP v7.1.13, PHP v7.1.18 and even PHP v7.2 do suffer from this behavior.

I worked around this issue by setting up a composer script on the post-autoload-dump event that searches for and destroys (renames) your destructors.

remove_noise breaking fields

PROBLEM
When parsing a document containing <input name="me" value="my { dog is nice">, the document is parsed incorrectly. The value property of $input in

   foreach($this->html->find("input[name='me']") as $input)

is "my {dog is nice" plus all remaining HTML, instead of just "my {dog is nice".

WORKAROUND
I commented $this->remove_noise("'({\w)(.*?)(})'s", true); in the load method, but I guess an improvement in remove_noise in order to be aware of quotes would be a better solution.

Regards, Pablo.

Blank TD returning next result, not the blank per se

Hi sunra, I am having an issue using Simple HTML DOM Parser. I have used it several times before, but only now have I come across this issue:

When searching for TDs, if there is a blank TD (containing &nbsp; or no content), I get the following TDs as its result.

I have found also that someone reported the same on Stackoverflow: http://stackoverflow.com/questions/11123267/simple-html-dom-parser-return-empty-td-with-all-tds-values

Example as a result of var_dumping $html->find('td'); (element 2 should be blank!):

0
12/02/2014 09:14 AM
1
MEXICO D.F. En proceso de entrega MEX MEXICO D.F.
2

12/02/2014 08:27 AM
MEXICO D.F. Llegada a centro de distribucion
Envio en proceso de entrega

Please allow to set MAX_FILE_SIZE from outside

I'm having some trouble trying to parse documents > MAX_FILE_SIZE. Since this is a constant, I can't redefine this in a clean way. I think you could define this as a public static var in class simple_html_dom_node and use it from there.

Maintained Fork of PHP Simple HTML DOM Parser (PHP 7.X)

https://github.com/voku/simple_html_dom


An HTML DOM parser written in PHP that lets you manipulate HTML in a very easy way! This is a fork of the PHP Simple HTML DOM Parser project, but instead of string manipulation it uses DOMDocument and modern PHP classes like "Symfony CssSelector".

  • PHP 7.0+ Support
  • PHP-FIG Standard
  • Composer & PSR-4 support
  • PHPUnit testing via Travis CI
  • PHP-Quality testing via SensioLabsInsight
  • UTF-8 Support (more support via "voku/portable-utf8")
  • Invalid HTML Support (partly ...)
  • Find tags on an HTML page with selectors just like jQuery
  • Extract contents from HTML in a single line

file_get_contents &amp;

In $contents = file_get_contents($url, $use_include_path, $context, $offset = 0), the & character is turned into &amp;.
This started only after I changed my hosting provider; with the other hosting provider it works fine.

Curly brackets cause unexpected behavior

This is a simple PHP script:

<?php
$html = HtmlDomParser::str_get_html('<html><body><span>a</span><span>{b</span><span>c}</span><span>d</span></body></html>');

foreach ($html->find('span') as $v) {
    echo $v->innertext."\n";
}
?>

I expected the following:

a
{b
c}
d

But the result is the following:

a
{b</span><span>c}
d
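
The merged spans are consistent with the noise-removal pattern quoted in the remove_noise report on this page. Assuming that pattern is (\{\w)(.*?)(\}) with the s modifier (the backslashes and the / delimiters are my reading, not a quote from the library), it matches from '{b' to the next '}' even across tag boundaries, as this standalone check shows:

```php
<?php
$html = '<span>a</span><span>{b</span><span>c}</span><span>d</span>';

// Assumed form of the library's noise pattern, rewritten with / delimiters.
$noisePattern = '/(\{\w)(.*?)(\})/s';

preg_match($noisePattern, $html, $m);
echo $m[0], "\n"; // {b</span><span>c}
```

Because the matched text is treated as a single piece of noise, the two spans in the middle would end up as one element.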

Fatal error: Maximum execution time of 60 seconds exceeded in \Sunra\PhpSimple\simplehtmldom_1_5\simple_html_dom.php on line 151

My class looks like this:

public function LayHinhTuDong9GagAction($id="aP9QwYV%2CaRjvbnq%2CaOB9wDy")
    {
        $client = new \GuzzleHttp\Client();
        $res = $client->request('GET', 'https://9gag.com/?id='.$id.'&c=10',
            [
                'headers' => [
                    'referer'=>'https://9gag.com/',
                    'x-requested-with'=>'XMLHttpRequest',
                    'method'=>'GET',
                    'authority'=>'9gag.com',
                    'path'=>'/?id=aP9QwYV%2CaRjvbnq%2CaOB9wDy&c=10',
                    'scheme'=>'https',
                    'accept'=>'application/json, text/javascript, */*; q=0.01',
                    'accept-encoding'=>'gzip, deflate, br',
                    'user-agent'=>'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'
                    ]
            ]
            );
        $chuoi_dulieu= $res->getBody();
        $stringBody = (string) $chuoi_dulieu;
        $stringBody=\GuzzleHttp\json_decode($stringBody);
        $array_hinh=$stringBody->items;
        echo count($array_hinh);
        var_dump($array_hinh);
        foreach($array_hinh as $key=>$hinh){
            $la_video=0;
            $dom = HtmlDomParser::str_get_html( $hinh );
            $tua_de=$dom->find("h2",0)->plaintext;
            $elems = $dom->find("source");
            if(empty($elems)) // handling for gif/mp4 video posts
            {
                $elems_2 = $dom->find("div[class=badge-video-container]");
                if(!empty($elems_2)){
                  if($elems_2[0]->{'data-video-source'}=="YouTube"){
                      echo "YouTube";
                      continue;
                  }
                }
            }else
            {
                $link_video=$elems[0]->src;
                $link_hinh=preg_replace('/460(\w*)/', "700b", $elems[0]->src);
                $link_hinh=str_replace("mp4","jpg",$link_hinh);
                $slug_id=$this->LuuVideo($link_video,$link_hinh);
                if(!$slug_id)
                {
                    echo "loi";exit;
                }
                $la_video=1;
//                continue; // continue the loop, skipping the code below
            }
            if($la_video!=1)
            {
            $elems = $dom->find("img"); // find the image when the post is not a video
            $link_hinh=preg_replace('/460(\w*)/', "700b", $elems[0]->src);
            $slug_id=$this->LuuAnh($link_hinh);
                if(!$slug_id)
                {
                    echo "loi";exit;
                }
            }
            echo $link_hinh.'<br>';
            $post=new \PostsCollection();
            $post->tua_de=$tua_de;
            $post->link_hinh="/photo/".$slug_id.'.jpg';
            $post->link_goc=$link_hinh;
            $post->save();
//            $dom->clear();
//            unset($dom);
            $this->view->pick("LayHinhTuDong/index");

        }
    }

Check if parent tag has id, attribute or class.

Hello,

I am looping through a HTML string as follows:

	foreach ( $dom->find( 'text' ) as $element ) {
		
		if ( !in_array( $element->parent()->tag, $excludedParents ) ) {				
			$element->innertext = preg_replace(
				'/(?<!\w)' . preg_quote( $search, "/" ) . '(?!\w)/i',
				$replace,
				$element->innertext
			);
		}
	}

This works fine for excluded parents like a, div or em, but not for a.test or div#test. Is there an elegant way to solve that?
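
One possible approach is to describe each excluded parent as a small selector string (a tag, an optional #id, an optional .class) and compare it against the parent's tag, id, and class values. The helper below is a sketch with a made-up name; it handles only that simple tag#id.class shape, not full CSS selectors.

```php
<?php
// Hypothetical helper: checks a tag/id/class triple against a selector of
// the form "tag", "tag#id", "tag.class", "#id" or ".class".
function matches_simple_selector(string $selector, string $tag, string $id = '', string $class = ''): bool
{
    $pattern = '/^([a-z][a-z0-9]*)?(?:#([\w-]+))?(?:\.([\w-]+))?$/i';
    if (!preg_match($pattern, $selector, $m)) {
        return false; // unsupported selector shape
    }
    $wantTag   = $m[1] ?? '';
    $wantId    = $m[2] ?? '';
    $wantClass = $m[3] ?? '';

    if ($wantTag !== '' && strcasecmp($wantTag, $tag) !== 0) {
        return false;
    }
    if ($wantId !== '' && $wantId !== $id) {
        return false;
    }
    if ($wantClass !== '') {
        // Compare against whole class names, not substrings.
        $classes = preg_split('/\s+/', trim($class), -1, PREG_SPLIT_NO_EMPTY);
        return in_array($wantClass, $classes, true);
    }
    return true;
}
```

In the loop above, the in_array test could become a loop over $excludedParents that calls matches_simple_selector($rule, $parent->tag, (string) $parent->id, (string) $parent->class), assuming the node exposes id and class as attribute properties.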

innerText() not returning valid utf-8 string

When you attempt to get the text of an element that has no HTML elements in it, it returns a non-UTF-8 encoded string. For an element such as

<h3>Технические работы на сервере<h3>

the string returned by innerText() is not encoded properly, but the string returned by outerText() has the proper encoding. This refers to the simple_html_dom_node class.

Text subnodes

Is it possible to extract the contents of the first text node?
I.e. the string Hello in the subtree

<div>
    Hello
    <strong>World!</strong>
</div>

Make composer.json valid

Running composer validate gives the following:

"./composer.json" does not match the expected JSON schema:
   - authors[0].name : The property name is required

Note: I have the patch file but do not have push access to the repository to create a pull request.

Is there any way to find if an element has a certain child or not?

The page I am parsing has a td tag that in some cases contains an <a> tag and in some cases doesn't.
I have tried $row->find('td', 2)->find('a', 0), and it says it can't find a value on null.
Is there any way to find out whether the child exists?

One way that I have found is count($row->find('td', 2)->find('a', 0)): if it returns 1, there is a child, and otherwise there is none.
Is there any other way to check?
Thanks in advance.
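
A null-safe variant is to check each step of the chain before calling the next one. The sketch below uses a made-up helper name and assumes, as the error in the question suggests, that find($selector, $index) returns null when nothing matches.

```php
<?php
// Hypothetical helper: true when the cell at $cellIndex exists and
// contains at least one <a> element.
function cell_has_link($row, int $cellIndex): bool
{
    $cell = $row->find('td', $cellIndex);
    if (!$cell) {
        return false; // the row has no such cell
    }
    return (bool) $cell->find('a', 0);
}
```

This avoids count() on a single node; on PHP 8, count() throws a TypeError when given a value that is neither an array nor a Countable object.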

Version 1.5.2 Sunra Uncaught error: Call Undefine Function file_get_html()

After upgrading to v1.5.2, calling the function file_get_html() always raises this error.

sunra

We are currently using v1.5.0, but after trying to update, composer shows this error:

Your requirement could not be resolved to an installation set of packages.

Problem 1

  • The requested package sunra/php-simple-html-dom-parser 1.5.0 exists as sunra/php-simple-html-dom-parser[dev-master, v1.5.1, v1.5.2] but these are rejected by your constraints

How to get value from dom?

I have the following table - only 2 rows shown for brevity. How do I traverse the table to extract the price class value for Catalog ID 100245 i.e. H1?

 <tbody>
      <tr class="catalog_line">
         <td class="properties">
            <div class="grid-prop">
               <span class="label nom">Catalog ID</span>
               <span class="catdata1 cdatamarker">100245</span>
            </div>
            <div class="grid-prop nom">
                <span class="label">Product, price class</span>
                <span class="catdata1">
                  <span class="category">Cars</span>
                  , H1
                </span>
            </div>
        </td>
      </tr>
      <tr class="catalog_line">
         <td class="properties">
            <div class="grid-prop">
               <span class="label nom">Catalog ID</span>
               <span class="catdata1 cdatamarker">100246</span>
            </div>
            <div class="grid-prop nom">
                <span class="label">Product, price class</span>
                <span class="catdata1">
                  <span class="category">Cars</span>
                  , H1
                </span>
            </div>
        </td>
      </tr>
  </tbody>

simple_html_dom_node::text() method is abnormal with embed tag

Example code

$content = "<div class=test><embed src='http://....swf' quality='high' width='480' height='400' align='middle' allowScriptAccess='always' allowFullScreen='true' mode='transparent' type='application/x-shockwave-flash'></embed></div>";
$dom = HtmlDomParser::str_get_html($content);
$newsContent = $dom->find(".test", 0)->text();
var_dump($newsContent);

Expected result:
string(0) ""
Result:
string(8) "</embed>"

php 7.1 fix

//function file_get_html($url, $use_include_path = false, $context=null, $offset = -1, $maxLen=-1, $lowercase = true, $forceTagsClosed=true, $target_charset = DEFAULT_TARGET_CHARSET, $stripRN=true, $defaultBRText=DEFAULT_BR_TEXT, $defaultSpanText=DEFAULT_SPAN_TEXT)
function file_get_html($url, $use_include_path = false, $context=null, $offset = 0, $maxLen=-1, $lowercase = true, $forceTagsClosed=true, $target_charset = DEFAULT_TARGET_CHARSET, $stripRN=true, $defaultBRText=DEFAULT_BR_TEXT, $defaultSpanText=DEFAULT_SPAN_TEXT)

Setting $offset = 0 fixes the problem with PHP 7.1.

HtmlDomParser::file_get_html() returning false for html page

Hi,

First of all, thanks for this great tool. I'm having a little problem. When I use either HtmlDomParser::file_get_html($urlOfThePage), or get the HTML of the page with curl and use HtmlDomParser::file_get_html($str), those functions return false for one specific HTML page. They work perfectly fine with other pages, just not this one. Why would that be?

Thanks.

MAX_FILE_SIZE user definition

I've stumbled upon an edge case where the HTML reached the MAX_FILE_SIZE constant; it would be nice to be able to increase it.

This could be implemented easily by defining the constant only if it is not already defined, so the user could define it as necessary.

Even better would be an exception, so that users know what happened without diving into the library code itself.
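
The guarded definition proposed above could look like the sketch below. The value 2000000 is purely illustrative, and 600000 is assumed here to be the library's current default.

```php
<?php
// In application code, before the library file is loaded:
define('MAX_FILE_SIZE', 2000000); // illustrative value chosen by the user

// In the library: define the default only when the user has not done so.
// 600000 is an assumption about the library's current default.
if (!defined('MAX_FILE_SIZE')) {
    define('MAX_FILE_SIZE', 600000);
}

echo MAX_FILE_SIZE, "\n"; // 2000000
```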

Parser strips new lines

Hello and thank you for your great work,

I'm using php-simple-html-dom-parser in a free project and am trying to solve a bug that occurred.

This is my code:

 foreach ( $dom->find( 'text' ) as $element ) {
 				if ( !in_array( $element->parent()->tag, [ 'a', 'pre', 'code' ] ) ) {
 					foreach ( $markers as $marker ) {
 						$text               = $marker[ 'text' ];
 						$url                = $marker[ 'url' ];
 						$tip                = strip_tags( $marker[ 'excerpt' ] );
 						$tooltip            = ( $tooltip ? "data-uk-tooltip title='$tip'" : "" );
 						$tmpval             = "tmpval-$i";
 						$element->innertext = preg_replace(
 							'/\b' . preg_quote( $text, "/" ) . '\b/i',
 							"<a href='$url' $hrefclass target='$target' $tmpval>\$0</a>",
 							$element->innertext,
 							1
 						);

 						$element->innertext = str_replace( $tmpval, $tooltip, $element->innertext );
 						$i++;
 					}
 				}
				
 			}

This code searches for text on a page and replaces words with other words.

It works fine.

But as I found out, this code is removing new lines from <pre><code>...</code></pre>:

This is an example-output using the code above:

<pre><code>&lt;div class=&quot;uk-form-row&quot;&gt;     &lt;label class=&quot;uk-form-label&quot;&gt;{{ &#39;Pages&#39; | trans }}&lt;/label&gt;     &lt;div class=&quot;uk-form-controls uk-form-controls-text&quot;&gt;         &lt;input-tree :active.sync=&quot;package.config.nodes&quot;&gt;&lt;/input-tree&gt;     &lt;/div&gt; &lt;/div&gt; </code></pre>

This is an example-output without using the code above:

<pre><code>&lt;div class=&quot;uk-form-row&quot;&gt;
    &lt;label class=&quot;uk-form-label&quot;&gt;{{ &#39;Pages&#39; | trans }}&lt;/label&gt;
    &lt;div class=&quot;uk-form-controls uk-form-controls-text&quot;&gt;
        &lt;input-tree :active.sync=&quot;package.config.nodes&quot;&gt;&lt;/input-tree&gt;
    &lt;/div&gt;
&lt;/div&gt;
</code></pre>

file_get_html returns false.

file_get_html returns false for this URL: https://tripadvisor.ca/Restaurant_Review-g255344-d724335-Reviews-Dynasty_Chinese_Restaurant-Launceston_Tasmania.html which can be loaded in the browser, but this URL works fine: 'https://tripadvisor.ca'

Use newest version of simplehtmldom

simplehtmldom is currently in version 1.8.1
Why not use the latest version?

I had trouble with the current version because mb_detect_encoding isn't available on all systems. This is fixed in version 1.8.1

file_get_contents(): stream does not support seeking

I have the following warnings when using this library

Warning: file_get_contents(): stream does not support seeking in vendor/sunra/php-simple-html-dom-parser/Src/Sunra/PhpSimple/simplehtmldom_1_5/simple_html_dom.php on line 81

Warning: file_get_contents(): Failed to seek to position -1 in the stream in vendor/sunra/php-simple-html-dom-parser/Src/Sunra/PhpSimple/simplehtmldom_1_5/simple_html_dom.php on line 81


Can someone help?
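
These warnings match the default $offset = -1 in the file_get_html signature quoted in the "php 7.1 fix" report on this page. Since PHP 7.1, file_get_contents treats a negative offset as a position counted from the end of the stream, which requires seeking, and HTTP streams cannot seek. The effect is easy to see on a local file (a small sketch using a temporary file):

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'off');
file_put_contents($path, 'abc');

// Offset 0 reads from the start; offset -1 (PHP 7.1+) starts at the last byte.
$fromStart = file_get_contents($path, false, null, 0);
$fromEnd   = file_get_contents($path, false, null, -1);

var_dump($fromStart); // string(3) "abc"
var_dump($fromEnd);   // string(1) "c"

unlink($path);
```

Passing 0 as the offset, as that report suggests, avoids the seek.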

:first-child is not working properly

I found a problem with the :first-child pseudo-class selector.

For this HTML

<div>
    <a href="javascript:void(0)">&times;</a>
    <div class="links">
        <ul>
            <li>
                <a href="https://github.com/">link 1</a>
                <span>(info)</span>
            </li>
            <li>
                <a href="https://github.com/">link 2</a>
                <span>(info)</span>
            </li>
        </ul>
    </div>
</div>

The selector .links > ul > li:first-child > a matches 0 elements, while the selector .links > ul > li > a matches two elements.
The expected behavior is that the selector .links > ul > li:first-child > a matches the link inside this element:

<li>
    <a href="https://github.com/">link 1</a>
    <span>(info)</span>
</li>

Unable to parse HTML

$dom = HtmlDomParser::str_get_html('

欢迎来到。这是我的第一篇文章。最先写作吧!

');

What is the cause of this mistake?

ErrorException : preg_match(): Compilation failed: invalid range in character class at offset 4

at /Users/enle/app/hyena-cms/vendor/sunra/php-simple-html-dom-parser/Src/Sunra/PhpSimple/simplehtmldom_1_5/simple_html_dom.php:1378
1374| $this->char = $this->doc[--$this->pos]; // prev
1375| return true;
1376| }
1377|

1378| if (!preg_match("/^[\w-:]+$/", $tag)) {
1379| $node->_[HDOM_INFO_TEXT] = '<' . $tag . $this->copy_until('<>');
1380| if ($this->char==='<') {
1381| $this->link_nodes($node, false);
1382| return true;

Exception trace:

1 preg_match("/^[\w-:]+$/", "p")
/Users/enle/app/hyena-cms/vendor/sunra/php-simple-html-dom-parser/Src/Sunra/PhpSimple/simplehtmldom_1_5/simple_html_dom.php:1378

2 simplehtmldom_1_5\simple_html_dom::read_tag()
/Users/enle/app/hyena-cms/vendor/sunra/php-simple-html-dom-parser/Src/Sunra/PhpSimple/simplehtmldom_1_5/simple_html_dom.php:1187

3 simplehtmldom_1_5\simple_html_dom::parse()
/Users/enle/app/hyena-cms/vendor/sunra/php-simple-html-dom-parser/Src/Sunra/PhpSimple/simplehtmldom_1_5/simple_html_dom.php:1081

4 simplehtmldom_1_5\simple_html_dom::load("

欢迎来到。这是我的第一篇文章。最先写作吧!

")
/Users/enle/app/hyena-cms/vendor/sunra/php-simple-html-dom-parser/Src/Sunra/PhpSimple/simplehtmldom_1_5/simple_html_dom.php:102

5 simplehtmldom_1_5\str_get_html("

欢迎来到。这是我的第一篇文章。最先写作吧!

")
/Users/enle/app/hyena-cms/vendor/sunra/php-simple-html-dom-parser/Src/Sunra/PhpSimple/HtmlDomParser.php:21

6 call_user_func_array("\simplehtmldom_1_5\str_get_html")
/Users/enle/app/hyena-cms/vendor/sunra/php-simple-html-dom-parser/Src/Sunra/PhpSimple/HtmlDomParser.php:21

7 Sunra\PhpSimple\HtmlDomParser::str_get_html("

欢迎来到。这是我的第一篇文章。最先写作吧!

")
/Users/enle/app/hyena-cms/app/Service/ArticleFormatter.php:296

8 App\Service\ArticleFormatter::convertImage(Object(Closure))
/Users/enle/app/hyena-cms/app/Service/ArticleFormatter.php:91

9 App\Service\ArticleFormatter::importImage()
/Users/enle/app/hyena-cms/app/Console/Commands/WpSynchronizationImage.php:50

10 App\Console\Commands\WpSynchronizationImage::handle()
/Users/enle/app/hyena-cms/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php:32

11 call_user_func_array([])
/Users/enle/app/hyena-cms/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php:32

12 Illuminate\Container\BoundMethod::Illuminate\Container{closure}()
/Users/enle/app/hyena-cms/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php:90

13 Illuminate\Container\BoundMethod::callBoundMethod(Object(Illuminate\Foundation\Application), Object(Closure))
/Users/enle/app/hyena-cms/vendor/laravel/framework/src/Illuminate/Container/BoundMethod.php:34

14 Illuminate\Container\BoundMethod::call(Object(Illuminate\Foundation\Application), [])
/Users/enle/app/hyena-cms/vendor/laravel/framework/src/Illuminate/Container/Container.php:576

15 Illuminate\Container\Container::call()
/Users/enle/app/hyena-cms/vendor/laravel/framework/src/Illuminate/Console/Command.php:183

16 Illuminate\Console\Command::execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Illuminate\Console\OutputStyle))
/Users/enle/app/hyena-cms/vendor/symfony/console/Command/Command.php:255

17 Symfony\Component\Console\Command\Command::run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Illuminate\Console\OutputStyle))
/Users/enle/app/hyena-cms/vendor/laravel/framework/src/Illuminate/Console/Command.php:170

18 Illuminate\Console\Command::run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
/Users/enle/app/hyena-cms/vendor/symfony/console/Application.php:908

19 Symfony\Component\Console\Application::doRunCommand(Object(App\Console\Commands\WpSynchronizationImage), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
/Users/enle/app/hyena-cms/vendor/symfony/console/Application.php:269

20 Symfony\Component\Console\Application::doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
/Users/enle/app/hyena-cms/vendor/symfony/console/Application.php:145

21 Symfony\Component\Console\Application::run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
/Users/enle/app/hyena-cms/vendor/laravel/framework/src/Illuminate/Console/Application.php:90

22 Illuminate\Console\Application::run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
/Users/enle/app/hyena-cms/vendor/laravel/framework/src/Illuminate/Foundation/Console/Kernel.php:122

23 Illuminate\Foundation\Console\Kernel::handle(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
/Users/enle/app/hyena-cms/artisan:38

Installing php-simple-html-dom-parser in CodeIgniter using composer - doesn't load the library?

Hi,

I installed the library using composer, so now I have a folder under "vendor/sunra/php-simple-html-dom-parser/....".
I am pasting my controller code here (using CodeIgniter); for some reason the library doesn't load properly.
I keep getting the error "Call to undefined function file_get_html()" when running the create_main_array() function.
Is there something that I'm not getting right?
I did include the autoload.php file, as with any other library installed with composer, and this has worked until now.
I did the same with use Sunra\PhpSimple\HtmlDomParser;.

<?php 

require FCPATH . 'vendor/autoload.php';
use Sunra\PhpSimple\HtmlDomParser;

/******************************************/
/* 		example Scraping			*/
/******************************************/


class Example extends CI_Controller {

    public function __construct() {
    
        parent::__construct();
          
		// Check if the user is logged in else KICK!: 		   
	    if ( ! $this->session->userdata('is_logged_in') ) {
	    	redirect('login');
	    } 

		// Load 'kas_model' Model
		$this->load->model('users_model');
		$this->load->model('expenses_model');
		
		// Sets the server not to have a time out. 
		ini_set('max_execution_time', 0); 
		ini_set('memory_limit', '-1');		
		// More Of MySQL
		ini_set('mysql.connect_timeout','0');
		// Expand the array displays
		ini_set('xdebug.var_display_max_depth', 5);
		ini_set('xdebug.var_display_max_children', 256);
		ini_set('xdebug.var_display_max_data', 1024);
    }

	// Main Page
	public function index(){
		$this->load->view('header');
		$this->load->view('dashboard');
		$this->load->view('example/main_example');
		$this->load->view('footer');
	}


	// Gets a page [string] variable and returns a string of the HTML. 
	public function scrape_page($page) {

		// $string = file_get_contents($page);
		$string = file_get_html($page);
		
		return $string;

	}


	// Running this controller
	public function create_main_array() {

		$string = $this->scrape_page('https://example.com/websites');

		// Find all images 
		foreach($string->find('img') as $element) 
		       echo $element->src . '<br>';

		// Find all links 
		foreach($string->find('a') as $element) 
		       echo $element->href . '<br>';

	}
}
