Giter VIP home page Giter VIP logo

Comments (13)

IzzySoft avatar IzzySoft commented on June 17, 2024

From what you've quoted, try:

preg_match_all('!<div class="sodatext">\s*<p>\s*(.*?)\s*</p>!ims',str_replace("\n"," ",$this->page["Quotes"]),$matches)

Though that might give you only the first paragraph of each quote (your quoted code doesn't show the "next element"). If you don't mind having the <p> within:

preg_match_all('!<div class="sodatext">\s*(.*?)\s*</div>!ims',str_replace("\n"," ",$this->page["Quotes"]),$matches)

might be a better fit. (not verified, just inspired by the source given in your question)

from imdbphp.

duck7000 avatar duck7000 commented on June 17, 2024

Thanks for the quick reply

I tested with this movie: http://www.imdb.com/title/tt0450385/quotes?ref_=ttalt_ql_4
I'm sorry if i'm not detailed enough but here is the whole code

<div id="qt0396883" class="quote soda sodavote odd" ><div class="sodatext">
<p>
<a href="/name/nm0000131/?ref_=tt_trv_qu"
><span class="character">Mike Enslin</span></a>:
[<span class="fine">Olin gives Enslin the room key</span>]
Most hotels have switched to magnetics. An actual key. That's a nice touch, it's antiquey. </p><p>
<a href="/name/nm0000168/?ref_=tt_trv_qu"
><span class="character">Gerald Olin</span></a>:
We have magnetic cards also, but electronics don't seem to work in 1408. Hope you don't have a pacemaker. </p><p>
<a href="/name/nm0000131/?ref_=tt_trv_qu"
><span class="character">Mike Enslin</span></a>:
[<span class="fine">into his tape recorder</span>]
General manager claims that the phantom in room interferes... </p><p>
<a href="/name/nm0000168/?ref_=tt_trv_qu"
><span class="character">Gerald Olin</span></a>:
I have *never* used the word "phantom." </p><p>
<a href="/name/nm0000131/?ref_=tt_trv_qu"
><span class="character">Mike Enslin</span></a>:
Oh, I'm sorry. Uh, spirit? Specter? </p><p>
<a href="/name/nm0000168/?ref_=tt_trv_qu"
><span class="character">Gerald Olin</span></a>:
No, you misunderstand. Whatever's in 1408 is nothing like that. </p><p>
<a href="/name/nm0000131/?ref_=tt_trv_qu"
><span class="character">Mike Enslin</span></a>:
Then what is it? </p><p>
<a href="/name/nm0000168/?ref_=tt_trv_qu"
><span class="character">Gerald Olin</span></a>:
It's an evil fucking room. </p></div>

I will try your second solution, if that works fine i strip those p tags off in my program
Thanks
Ed

from imdbphp.

duck7000 avatar duck7000 commented on June 17, 2024

both preg_match_all works, but give a foreach error invalid argument. I changed the foreach to index 1 and that works fine!

however the method is replacing line breaks with space so all line breaks within each paragraph are also gone..

from imdbphp.

IzzySoft avatar IzzySoft commented on June 17, 2024

New lines are just for code formatting here. Especially with the second variant, where you get full HTML returned (including <P>aragraphs, the \newline characters should not matter.

It might just be a quick "work-around" for you. I no longer maintain the code (I did for 12 years, but a while ago Tom has taken over; currently I'm not even use IMDBPHP myself) – so for the final fix, better let @tboothman (Tom) decide.

from imdbphp.

duck7000 avatar duck7000 commented on June 17, 2024

Oke fair enough i'll wait for tbootman to reply

ps the first preg_match_all does indeed capture only the first paragrah, the second all inside the div
for now i use the second one as a work arround

Thanks
Ed

from imdbphp.

IzzySoft avatar IzzySoft commented on June 17, 2024

Ah, it's Ed :) Yepp, I thought so when setting up the "quick-fix regexes". Tom will do the fine-tuning and provide a real solution – while you (and others finding this issue here) at least have a work-around until then.

from imdbphp.

tboothman avatar tboothman commented on June 17, 2024

Ahh, I see how it is. You get the glory and I write the tests :P

from imdbphp.

tboothman avatar tboothman commented on June 17, 2024

That output is really dirty. Really each quote should be split into lines and which character said them and who the actor is. I'm not particularly interested in doing the grunt work on that though and it'd change the return signature.

A 'quote' from the matrix

<p>
<a href="/name/nm0000206/?ref_=tt_trv_qu"
><span class="character">Neo</span></a>:
I thought it wasn't real </p><p>
<a href="/name/nm0000401/?ref_=tt_trv_qu"
><span class="character">Morpheus</span></a>:
Your mind makes it real </p><p>
<a href="/name/nm0000206/?ref_=tt_trv_qu"
><span class="character">Neo</span></a>:
If you're killed in the matrix, you die here? </p><p>
<a href="/name/nm0000401/?ref_=tt_trv_qu"
><span class="character">Morpheus</span></a>:
The body cannot live without the mind </p>

from imdbphp.

IzzySoft avatar IzzySoft commented on June 17, 2024

Ahh, I see how it is. You get the glory and I write the tests :P

If you insist, I agree with that 😺

Honestly: I just wanted to provide a "quick work-around", giving you time for a good solution. What I wrote is just a quick-and-dirty hack, and certainly needs some "fine-tuning" at least (and rather a lot of the latter, I'm afraid) …

from imdbphp.

duck7000 avatar duck7000 commented on June 17, 2024

So this is the solution for now?
I agree that the output is rough though..

i think that , if it's possible, to somehow capture all paragraphs in the div's would be a nice solution?
And then split the capture in name, imdbid and comment?

I understand that would be a lot of more work though

from imdbphp.

IzzySoft avatar IzzySoft commented on June 17, 2024

@duck7000 as I understand Tom, it means he's still working on a "good solution" and considers mine a "usable work-around, not clean enough to be accepted as solution" – as I do. I fully trust him to come up with something much better and cleaner than my "quick hack", which certainly will have a "Wow!" effect. I know why I handed over the project to Tom: he's at least as perfectionist as I am, so no worries 😜

Give him a little time, though – IMDB is tricky, and all of us want something that stays working a little longer. Perfectionists' work takes a bit more – but then it's, well, perfect 😺

from imdbphp.

tboothman avatar tboothman commented on June 17, 2024

Well ... I've 'fixed' the method back to how it was before. I don't really think what it's producing is particularly good but that is what the method says it does.

If you want to make a method that does something more useful I wouldn't be opposed to writing some tests for it. I think I'd recommend using xmldom / xpath for this one though because it's quite nicely structured. http://www.imdb.com/title/tt0133093/quotes has quite a few variations including stage directions as its own line.

from imdbphp.

duck7000 avatar duck7000 commented on June 17, 2024

Oke fair enough, the method now works like it always did i guess, although i never used it before..
I'm upgrading my program so i thought it would be nice to get as much info from imdb as possible.

So i accept that you both wouldn't spend the time to make it better
Thanks for the fix and input

from imdbphp.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.