Comments (5)

JorjMcKie commented on July 22, 2024

Please let me have the PDF. It is required for following up.

from pymupdf.

JorjMcKie commented on July 22, 2024

I don't understand what the problem is that you are reporting here.
Please re-assess the situation as follows:

  1. Load the page, page = doc[page_no]
  2. Execute page.remove_rotation()
  3. Now extract stuff from the page, for instance text or drawings, or insert text, drawings or images.

Only if the behavior in step 3 is not as expected do we have a bug.
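The three steps above can be sketched as a small helper. This is a minimal sketch, not code from the thread: the function name `extract_after_derotation` is hypothetical, and it only assumes that `doc` is an opened PyMuPDF document (or any object that behaves like one).

```python
def extract_after_derotation(doc, page_no):
    """Load a page, remove its rotation, then extract its drawings.

    `doc` is assumed to be an opened PyMuPDF document, for example
    doc = fitz.open(pdf_path). The function name is hypothetical.
    """
    page = doc[page_no]         # step 1: load the page
    page.remove_rotation()      # step 2: set the rotation to 0
    return page.get_drawings()  # step 3: extract (drawings in this sketch)
```

If the list returned here differs from what is expected for the de-rotated page, that result is the one worth reporting.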

Nageswarachand commented on July 22, 2024

A clearer explanation of the problem:

The code below loads the input PDF and uses get_drawings() to extract the width of each item.
(The orientation of this input is 90 degrees.)

import fitz  # PyMuPDF

doc = fitz.open(pdf_path)  # pdf_path: path to the input PDF

for page in doc:
    # Rotation stored for this page (90 for this input)
    print("orientation ---", page.rotation)
    drawing_list = page.get_drawings()

    # Print the line width of the first 15 drawings on the page
    for item in drawing_list[:15]:
        print(item['width'])

output:
image

The code below loads the same input PDF, calls remove_rotation() to change the page orientation from 90 to 0, and then uses get_drawings() to extract the width of each item.

import fitz  # PyMuPDF

doc = fitz.open(pdf_path)  # pdf_path: path to the input PDF

# Remove the rotation of every page (90 -> 0 for this input)
for page in doc:
    page.remove_rotation()
    print("orientation ---", page.rotation)

# Print the line width of the first 15 drawings on each page
for page in doc:
    drawing_list = page.get_drawings()
    for item in drawing_list[:15]:
        print(item['width'])

output:
image

Here is the problem: both snippets use the same input PDF, but after remove_rotation() the item width is None or 0 for every item. I printed only the first 15 items of drawing_list, but all of them show a width of 0 when I use remove_rotation().

Queries

  1. Why is the width None for every item? (I performed some operations using these widths, but every item gave None.)
  2. I am using PyMuPDF version 1.24.3. Is this a problem with that version?
  3. Otherwise, can you suggest another PyMuPDF feature? (I don't want to rotate the page; I want to change the orientation of the page and have the page coordinates change accordingly. Assume the valid orientations are 0, 90, 180 and 270.)

Nageswarachand commented on July 22, 2024

Here is the input PDF:

Grace manor-mid rise floor and columnschedule.pdf

Please suggest another approach, or fix the issue in this feature. remove_rotation() itself works fine, but afterwards the item width is None or 0, and I need the width for my later operations on the PDF.

JorjMcKie commented on July 22, 2024

I checked the results of page.remove_rotation(), and it does behave as expected.
I can't say I really understood your problem.
But please be aware that the results of page.get_drawings() for the rotated versus the derotated page may differ significantly. For example, the first path of the original page 0 (rotated by 90°) is this:

{'closePath': None,
 'color': None,
 'dashes': None,
 'even_odd': False,
 'fill': (1.0, 1.0, 1.0),
 'fill_opacity': 1.0,
 'items': [('re', Rect(938.3999633789062, 1509.5999755859375, 948.0, 1519.4400634765625), 1)],
 'layer': '',
 'lineCap': None,
 'lineJoin': None,
 'rect': Rect(938.3999633789062, 1509.5999755859375, 948.0, 1519.4400634765625),
 'seqno': 0,
 'stroke_opacity': None,
 'type': 'f',
 'width': None}

The same path after derotation of the page looks like this:

{'closePath': False,
  'color': None,
  'dashes': None,
  'even_odd': False,
  'fill': (1.0, 1.0, 1.0),
  'fill_opacity': 1.0,
  'items': [('l', Point(1504.5599365234375, 938.3999633789062), Point(1504.5599365234375, 948.0)),
            ('l', Point(1504.5599365234375, 948.0), Point(1514.4000244140625, 948.0)),
            ('l', Point(1514.4000244140625, 948.0), Point(1514.4000244140625, 938.3999633789062)),
            ('l', Point(1514.4000244140625, 938.3999633789062), Point(1504.5599365234375, 938.3999633789062))],
  'layer': '',
  'lineCap': None,
  'lineJoin': None,
  'rect': Rect(1504.5599365234375, 938.3999633789062, 1514.4000244140625, 948.0),
  'seqno': 0,
  'stroke_opacity': None,
  'type': 'f',
  'width': None}

Yet both dictionaries describe the same path, which you can see by multiplying the original paths[0]["rect"] by page.rotation_matrix. This gives the coordinates as if the page were not rotated:

paths[0]["rect"] * page.rotation_matrix
Rect(1504.5599365234375, 938.3999633789062, 1514.4000244140625, 948.0)

This is visibly the same rectangle as that of the first path after de-rotation.
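The multiplication above can also be reproduced by hand in plain Python, which may help when checking coordinates for other paths. This is only a sketch: the helper `transform_rect` is hypothetical, and the matrix values are an assumption inferred from the two rectangles quoted in this thread (x' = 3024 - y, y' = x), not read from the PDF. On a real page, use page.rotation_matrix instead.

```python
def transform_rect(rect, matrix):
    """Apply an affine matrix (a, b, c, d, e, f) to a rectangle.

    Each corner (x, y) maps to (a*x + c*y + e, b*x + d*y + f). The result
    is the smallest axis-parallel rectangle containing the mapped corners.
    """
    a, b, c, d, e, f = matrix
    x0, y0, x1, y1 = rect
    corners = [(a * x + c * y + e, b * x + d * y + f)
               for x in (x0, x1) for y in (y0, y1)]
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (min(xs), min(ys), max(xs), max(ys))


# Rectangle of the first path on the rotated page, copied from the thread.
rotated_rect = (938.3999633789062, 1509.5999755859375,
                948.0, 1519.4400634765625)

# ASSUMPTION: inferred from the rectangles in the thread, not from the PDF.
assumed_matrix = (0, 1, -1, 0, 3024, 0)

derotated_rect = transform_rect(rotated_rect, assumed_matrix)
print(derotated_rect)
```

Under that assumed matrix, the printed tuple should match the 'rect' of the de-rotated path quoted above.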

So maybe this is a way for you to circumvent the problem.
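On the width question raised earlier in the thread: both path dictionaries quoted above have 'type': 'f' and 'width': None, before as well as after de-rotation. As far as the PyMuPDF documentation describes it, 'width' is the stroke line width and is only set for stroked paths, so a fill-only path reporting None may be expected rather than a side effect of remove_rotation(). A minimal sketch that collects widths only from stroked paths follows; the helper name `stroke_widths` is hypothetical, and the type codes 's' and 'fs' are assumed to mark stroke and fill-and-stroke paths.

```python
def stroke_widths(drawings, default=None):
    """Return the line widths of stroked paths from get_drawings() output.

    Paths whose 'type' is not 's' (stroke) or 'fs' (fill and stroke) are
    skipped, because they are assumed to carry no line width.
    """
    widths = []
    for path in drawings:
        if path.get("type") in ("s", "fs"):
            width = path.get("width")
            widths.append(default if width is None else width)
    return widths
```

With a real page this could be called as stroke_widths(page.get_drawings()). An empty result would suggest that the inspected paths are fill-only.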
