
Comments (6)

psaiz avatar psaiz commented on June 24, 2024

Thanks for the ticket. It looks like there are two different issues in this ticket:

  • cernopendata-client inconsistent with the results from the web.
  • Issues with the updates.

In this ticket I will focus on the second issue. For the first one, it might be better to create a dedicated ticket on the cernopendata-client repo.

I've created an empty instance locally, populated it only with the file mentioned in the ticket, then executed the command again in replace mode, and I can't reproduce the issue yet. I'll do the replace several more times to see if I can reproduce it.

Do you have the same issue with other files? This one is the largest, with the most entries to process. Since all the entries are processed in one transaction, I wonder if the transaction is too big.
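To illustrate the transaction-size question, here is a minimal sketch of loading each record inside its own savepoint, so that one failing record does not roll back the whole batch. It uses the standard-library `sqlite3` module rather than the actual Invenio/SQLAlchemy code path, and the `files` table, the `load_records` helper, and the example URIs are all hypothetical.

```python
import sqlite3


def load_records(conn, records):
    """Insert each (recid, uri) pair inside its own savepoint.

    Returns the recids that failed; rows inserted before and after
    a failure are kept.
    """
    failed = []
    for recid, uri in records:
        conn.execute("SAVEPOINT one_record")
        try:
            conn.execute(
                "INSERT INTO files (recid, uri) VALUES (?, ?)", (recid, uri)
            )
        except sqlite3.IntegrityError:
            # Undo only this record's work, not the whole batch.
            conn.execute("ROLLBACK TO SAVEPOINT one_record")
            failed.append(recid)
        conn.execute("RELEASE SAVEPOINT one_record")
    return failed


# isolation_level=None lets the savepoints be managed explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE files (recid INTEGER, uri TEXT UNIQUE)")

# Hypothetical data: record 2 reuses the URI of record 1.
records = [
    (1, "root://example//a"),
    (2, "root://example//a"),
    (3, "root://example//b"),
]
failed = load_records(conn, records)
stored = [row[0] for row in conn.execute("SELECT recid FROM files ORDER BY recid")]
print(failed, stored)
```

Whether this granularity would suit the real fixture loader is an open question; the sketch only shows that a per-record boundary keeps one constraint error from affecting the other records.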

From the logs posted in this ticket, you don't have access anymore to the message of the exception mentioned in "This Session's transaction has been rolled back due to a previous exception during flush.", do you?

from opendata.cern.ch.

tiborsimko avatar tiborsimko commented on June 24, 2024

cernopendata-client inconsistent with the results from the web

I don't think there is any problem with cernopendata-client as such. I used it simply to automatically discover problems. You can find the same problems manually by browsing the web as well. Or by calling the REST API directly, such as:

$ curl http://opendata-qa.cern.ch/api/records/7794

{
  "message": "The server could not verify that you are authorized to access the URL requested. You either supplied the wrong credentials (e.g. a bad password), or your browser doesn't understand how to supply the credentials required.",
  "status": 401
}


tiborsimko avatar tiborsimko commented on June 24, 2024

I've created an empty instance locally, populated it only with the file mentioned in the ticket, then executed the command again in replace mode, and I can't reproduce the issue yet.

Have you followed exactly the procedure I mentioned? I.e. load old file (without file information of concerned records), then update with new file, and then re-update once again? I can reproduce the problem in this way.

Do you have the same issue with other files?

Haven't tried with other records, since I wanted to reproduce locally exactly the problem we were seeing on DEV and QA. But I can try to reproduce with a small file if it would be useful.

From the logs posted in this ticket, you don't have access anymore to the message of the exception mentioned in "This Session's transaction has been rolled back due to a previous exception during flush.", do you?

Nope, but I have just reproduced the problem, so here it is:

...
Record recid 7776 updated.
Record recid 7781 updated.
Record recid 7782 updated.
Record recid 7784 updated.
Record recid 7785 updated.
Record recid 7786 updated.
Recid 7787 file CMS_mc_Summer12_DR53X_GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6_AODSIM_PU_S10_START53_V19-v1_20000_file_index.json could not be loaded due to (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "uq_files_files_uri"
DETAIL:  Key (uri)=(root://eospublic.cern.ch//eos/opendata/cms/mc/Summer12_DR53X/GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6/AODSIM/PU_S10_START53_V19-v1/file-indexes/CMS_mc_Summer12_DR53X_GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6_AODSIM_PU_S10_START53_V19-v1_20000_file_index.json) already exists.

[SQL: INSERT INTO files_files (created, updated, id, uri, storage_class, size, checksum, readable, writable, last_check_at, last_check) VALUES (%(created)s, %(updated)s, %(id)s, %(uri)s, %(storage_class)s, %(size)s, %(checksum)s, %(readable)s, %(writable)s, %(last_check_at)s, %(last_check)s)]
[parameters: {'created': datetime.datetime(2024, 5, 14, 6, 52, 9, 585255), 'updated': datetime.datetime(2024, 5, 14, 6, 52, 9, 585262), 'id': UUID('9b73ef1d-d5af-4f78-867a-af2695fb9a9a'), 'uri': 'root://eospublic.cern.ch//eos/opendata/cms/mc/Summer12_DR53X/GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6/AODSIM/PU_S10_START53_V19-v1/file-indexes/CMS_mc_Summer12_DR53X_GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6_AODSIM_PU_S10_START53_V19-v1_20000_file_index.json', 'storage_class': 'S', 'size': 4305, 'checksum': 'adler32:dc00a245', 'readable': True, 'writable': False, 'last_check_at': None, 'last_check': True}]
(Background on this error at: https://sqlalche.me/e/14/gkpj).
Recid 7787 file CMS_mc_Summer12_DR53X_GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6_AODSIM_PU_S10_START53_V19-v1_20000_file_index.txt could not be loaded due to This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "uq_files_files_uri"
DETAIL:  Key (uri)=(root://eospublic.cern.ch//eos/opendata/cms/mc/Summer12_DR53X/GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6/AODSIM/PU_S10_START53_V19-v1/file-indexes/CMS_mc_Summer12_DR53X_GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6_AODSIM_PU_S10_START53_V19-v1_20000_file_index.json) already exists.

[SQL: INSERT INTO files_files (created, updated, id, uri, storage_class, size, checksum, readable, writable, last_check_at, last_check) VALUES (%(created)s, %(updated)s, %(id)s, %(uri)s, %(storage_class)s, %(size)s, %(checksum)s, %(readable)s, %(writable)s, %(last_check_at)s, %(last_check)s)]
[parameters: {'created': datetime.datetime(2024, 5, 14, 6, 52, 9, 585255), 'updated': datetime.datetime(2024, 5, 14, 6, 52, 9, 585262), 'id': UUID('9b73ef1d-d5af-4f78-867a-af2695fb9a9a'), 'uri': 'root://eospublic.cern.ch//eos/opendata/cms/mc/Summer12_DR53X/GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6/AODSIM/PU_S10_START53_V19-v1/file-indexes/CMS_mc_Summer12_DR53X_GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6_AODSIM_PU_S10_START53_V19-v1_20000_file_index.json', 'storage_class': 'S', 'size': 4305, 'checksum': 'adler32:dc00a245', 'readable': True, 'writable': False, 'last_check_at': None, 'last_check': True}]
(Background on this error at: https://sqlalche.me/e/14/gkpj) (Background on this error at: https://sqlalche.me/e/14/7s2a).
Traceback (most recent call last):
  File "/opt/invenio/var/instance/python/bin/cernopendata", line 33, in <module>
    sys.exit(load_entry_point('cernopendata', 'console_scripts', 'cernopendata')())
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/click/core.py", line 1719, in invoke
    rv.append(sub_ctx.command.invoke(sub_ctx))
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/flask/cli.py", line 357, in decorator
    return __ctx.invoke(f, *args, **kwargs)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/code/cernopendata/modules/fixtures/cli.py", line 249, in records
    record.files.flush()
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/invenio_records_files/api.py", line 270, in files
    record_id=self.id
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/invenio_records/api.py", line 87, in id
    return self.model.id if self.model else None
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/attributes.py", line 487, in __get__
    return self.impl.get(state, dict_)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/attributes.py", line 959, in get
    value = self._fire_loader_callables(state, key, passive)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/attributes.py", line 990, in _fire_loader_callables
    return state._load_expired(state, passive)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/state.py", line 712, in _load_expired
    self.manager.expired_attribute_loader(self, toload, passive)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/loading.py", line 1451, in load_scalar_attributes
    result = load_on_ident(
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/loading.py", line 407, in load_on_ident
    return load_on_pk_identity(
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/loading.py", line 530, in load_on_pk_identity
    session.execute(
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1665, in execute
    ) = compile_state_cls.orm_pre_session_exec(
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/context.py", line 312, in orm_pre_session_exec
    session._autoflush()
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 2253, in _autoflush
    self.flush()
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3449, in flush
    self._flush(objects)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3478, in _flush
    self.dispatch.before_flush(self, flush_context, objects)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/event/attr.py", line 247, in __call__
    fn(*args, **kw)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy_continuum/manager.py", line 343, in before_flush
    uow = self.unit_of_work(session)
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy_continuum/manager.py", line 305, in unit_of_work
    conn = session.connection()
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1545, in connection
    return self._connection_for_bind(
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1555, in _connection_for_bind
    return self._transaction._connection_for_bind(
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 724, in _connection_for_bind
    self._assert_active()
  File "/opt/invenio/var/instance/python/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 604, in _assert_active
    raise sa_exc.PendingRollbackError(
sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "uq_files_files_uri"
DETAIL:  Key (uri)=(root://eospublic.cern.ch//eos/opendata/cms/mc/Summer12_DR53X/GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6/AODSIM/PU_S10_START53_V19-v1/file-indexes/CMS_mc_Summer12_DR53X_GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6_AODSIM_PU_S10_START53_V19-v1_20000_file_index.json) already exists.

[SQL: INSERT INTO files_files (created, updated, id, uri, storage_class, size, checksum, readable, writable, last_check_at, last_check) VALUES (%(created)s, %(updated)s, %(id)s, %(uri)s, %(storage_class)s, %(size)s, %(checksum)s, %(readable)s, %(writable)s, %(last_check_at)s, %(last_check)s)]
[parameters: {'created': datetime.datetime(2024, 5, 14, 6, 52, 9, 585255), 'updated': datetime.datetime(2024, 5, 14, 6, 52, 9, 585262), 'id': UUID('9b73ef1d-d5af-4f78-867a-af2695fb9a9a'), 'uri': 'root://eospublic.cern.ch//eos/opendata/cms/mc/Summer12_DR53X/GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6/AODSIM/PU_S10_START53_V19-v1/file-indexes/CMS_mc_Summer12_DR53X_GluGlu_NMSSM_BBandA1_A1ToMuMu_mA1-40_8TeV_pythia6_AODSIM_PU_S10_START53_V19-v1_20000_file_index.json', 'storage_class': 'S', 'size': 4305, 'checksum': 'adler32:dc00a245', 'readable': True, 'writable': False, 'last_check_at': None, 'last_check': True}]
(Background on this error at: https://sqlalche.me/e/14/gkpj) (Background on this error at: https://sqlalche.me/e/14/7s2a)
/usr/lib64/python3.9/site-packages/XRootD/client/finalize.py:46: DeprecationWarning: Importing 'itsdangerous.json' is deprecated and will be removed in ItsDangerous 2.1. Use Python's 'json' module instead.
  if isinstance(obj, File) and obj.is_open():
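The log above shows an INSERT with a fresh `id` but a `uri` that already exists, which trips the `uq_files_files_uri` constraint. The following sketch reproduces that pattern in isolation with the standard-library `sqlite3` module. It is not the Invenio code: the reduced `files_files` schema, the helper names `insert_file` and `get_or_insert_file`, and the example URI are assumptions made for illustration.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of the table named in the log.
conn.execute(
    "CREATE TABLE files_files ("
    "id TEXT PRIMARY KEY, uri TEXT, "
    "CONSTRAINT uq_files_files_uri UNIQUE (uri))"
)


def insert_file(conn, uri):
    """Always insert a new row with a fresh UUID, as the failing INSERT does."""
    file_id = str(uuid.uuid4())
    conn.execute("INSERT INTO files_files (id, uri) VALUES (?, ?)", (file_id, uri))
    return file_id


def get_or_insert_file(conn, uri):
    """Reuse the existing row for this URI if there is one."""
    row = conn.execute(
        "SELECT id FROM files_files WHERE uri = ?", (uri,)
    ).fetchone()
    return row[0] if row else insert_file(conn, uri)


uri = "root://example.org//path/file_index.json"
first_id = insert_file(conn, uri)

try:
    insert_file(conn, uri)  # same URI, new id
    second_insert_failed = False
except sqlite3.IntegrityError:
    second_insert_failed = True

reused_id = get_or_insert_file(conn, uri)
print(second_insert_failed, reused_id == first_id)
```

The lookup-before-insert variant is only one possible way to avoid the duplicate; the thread does not say how the fixture loader should handle it.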


psaiz avatar psaiz commented on June 24, 2024

The permission error is likely related to the permission of the file on eos:

[psaiz@aiadm08 ~]$ ls -al /eos/opendata/cms/mc/Summer12_DR53X/GluGlu_NMSSM_H2ToH1H1_H1To2Mu2B_mH2-125_mH1-60_8TeV_pythia6/AODSIM/PU_S10_START53_V19-v1/file-indexes/
total 16
drwxr-xr-x. 2 simko    us     4096 Apr 30 15:07 .
drwxr-xr-x. 2 cmsrucio def-cg 4096 Apr 30 15:07 ..
-rw-r-----. 1 simko    us     4777 Apr 30 15:07 CMS_mc_Summer12_DR53X_GluGlu_NMSSM_H2ToH1H1_H1To2Mu2B_mH2-125_mH1-60_8TeV_pythia6_AODSIM_PU_S10_START53_V19-v1_00000_file_index.json
-rw-r-----. 1 simko    us     2772 Apr 30 15:07 CMS_mc_Summer12_DR53X_GluGlu_NMSSM_H2ToH1H1_H1To2Mu2B_mH2-125_mH1-60_8TeV_pythia6_AODSIM_PU_S10_START53_V19-v1_00000_file_index.txt

Changing the permission there might solve the issue.
Thanks for the info on the duplicate. I'll see if I can reproduce it.


tiborsimko avatar tiborsimko commented on June 24, 2024

The permission error is likely related to the permission of the file on eos:

It shouldn't be related to the index file permissions, because the record is accessible from another open data deployment that points to the same index file, e.g. compare:

$ curl http://opendata-qa.cern.ch/api/records/7794
$ curl http://opendata-dev.cern.ch/api/records/7794
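The same comparison can be scripted. Below is a minimal sketch that fetches the HTTP status of one record from several deployments and reports whether they disagree. The helper names `http_status`, `compare_record`, and `differing` are hypothetical, and the `fetch` parameter exists only so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request


def http_status(url):
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Error responses such as 401 arrive as exceptions.
        return err.code


def compare_record(recid, hosts, fetch=http_status):
    """Map each host to the status returned for /api/records/<recid>."""
    return {host: fetch(f"http://{host}/api/records/{recid}") for host in hosts}


def differing(statuses):
    """True if the deployments do not all return the same status."""
    return len(set(statuses.values())) > 1


# Real usage (needs network access to the deployments):
# statuses = compare_record(7794, ["opendata-qa.cern.ch", "opendata-dev.cern.ch"])
# print(statuses, differing(statuses))
```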


tiborsimko avatar tiborsimko commented on June 24, 2024

Haven't tried with other records, since I wanted to reproduce locally exactly the problem we were seeing on DEV and QA. But I can try to reproduce with a small file if it would be useful.

I have managed to reproduce the problem with a file containing a single record. Here's the recipe:

$ docker exec -i -t opendatacernch-web-1 /code/scripts/populate-instance.sh --skip-records --skip-glossary --skip-docs
$ cat cernopendata/modules/fixtures/data/records/cms-tools-vm-image-2012.json | jq 'del( .[] ["files"])' > cernopendata/modules/fixtures/data/records/cms-tools-vm-image-2012-nofiles.json 
$ docker exec -i -t opendatacernch-web-1 cernopendata fixtures records --mode insert -f cernopendata/modules/fixtures/data/records/cms-tools-vm-image-2012-nofiles.json
$ docker exec -i -t opendatacernch-web-1 cernopendata fixtures records --mode insert-or-replace -f cernopendata/modules/fixtures/data/records/cms-tools-vm-image-2012.json
$ docker exec -i -t opendatacernch-web-1 cernopendata fixtures records --mode insert-or-replace -f cernopendata/modules/fixtures/data/records/cms-tools-vm-image-2012.json
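For readers without `jq`, the second step of the recipe (dropping the `files` key from every record in the JSON array) could be sketched in Python as follows. The `strip_files` helper and the sample record are hypothetical; the real fixture file has many more fields.

```python
import json


def strip_files(records):
    """Return copies of the records without their "files" key."""
    return [
        {key: value for key, value in record.items() if key != "files"}
        for record in records
    ]


# Hypothetical stand-in for the contents of a fixture file.
sample = json.loads(
    '[{"recid": 1, "title": "demo", "files": [{"uri": "root://example//a"}]}]'
)
stripped = strip_files(sample)
print(json.dumps(stripped, indent=2))
```

To apply it to a real fixture, one would load the file with `json.load`, pass the resulting list to `strip_files`, and write the result to the `-nofiles.json` path with `json.dump`.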
