Comments (17)
Hi @BenxiaHu,
This is really not appropriate language to use to ask for someone to implement a software feature for you.
@ALL, as Silvia mentioned above, the currently reported location of a feature corresponds to its position on the matrix rather than to its genomic coordinates. It is implemented this way in this version of the software because which coordinates to report can depend heavily on the type of feature being detected (e.g., TAD, stripe, loop), which makes it difficult to come up with a default approach. For the time being, you can use Silvia's explanation above to compute these coordinates yourself. We will review the documentation and include further guidance if necessary.
In addition, we always encourage potential contributors to write pull requests with features of interest, so we can review them and incorporate them in an updated version if appropriate.
I hope that helps.
Best,
Juanma Vaquerizas
your answer really does not mean anything.
what we want is just to retrieve the genome coordinates based on gained/lost features.
Do you understand our question?
from chess.
Hi @sgalan thank you for your answer.
I understand that this is going to be a general issue for a lot of people i.e. it is not so trivial to understand. Don't you also have the same issue when analyzing data for your own papers?
If so, would you be so nice to share in the chess package a script allowing to retrieve the regions involved in a given extracted feature like a loop?
Thanks a million for your time and patience
from chess.
Hi,
please have a look at the documentation for FAN-C, where you can find detailed descriptions of the plotting API, which I think is going to be very useful in addressing your questions: https://fan-c.readthedocs.io/en/latest/api/plot.html
If you feel like this may be beyond your expertise, feel free to contact the Vaquerizas lab to enquire about a possible collaboration: https://www.vaquerizaslab.org/contact/
from chess.
OK. I thought having the margin working in your demo script would be of general interest. I'll check the docs to find out how to get the OE from the matrix. I also assumed it was an easy one for you. Sorry for bothering. Thank you.
from chess.
sorry I closed this too fast.
Reading the FAN-C docs, I managed to plot what I want with respect to point 1 above. But I am still trying to understand how I can add a margin when the window size is smaller, to improve the display. I can't find information on the format of e.g. gained_features.tsv. Could you please give a hint?
from chess.
Hi @cgirardot!
Do I understand correctly that you want to plot a large region (larger than a single window in your chess sim run) and display all the rectangles for the extracted features in that?
If yes, then I am sorry to say that we currently have no code for that. The right person to help here is @sgalan, but for now maybe I can provide some information that will help: in gained_features.csv each row corresponds to a single rectangle, i.e. a single feature extracted. The first columns are, in that order:
- the pair id (same as the first column in the chess sim output) of the region the feature is in
- an id for that specific feature
- x max
- x min
- y max
- y min

The last four values define the four corners of the rectangle marking the feature in the region matrix.
In order to plot these rectangles at the correct positions in a larger matrix (e.g. with margins), the x and y coordinates have to be transformed. I don't know how to do this at the moment. If @sgalan has no easy solution I will mark this as requested enhancement and place it on our to-do list.
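The column layout described above can be read with a short sketch like the following. It is not part of CHESS: the names FeatureRect and read_feature_rects are hypothetical, the tab delimiter is an assumption (pass a different one if your file is comma-separated), and the column order is taken as listed in this comment, so please check it against your own file. The values stay in region-matrix coordinates; no conversion to genomic positions is attempted.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class FeatureRect(NamedTuple):
    """One row of a features file (hypothetical container, not a CHESS class)."""
    pair_id: str
    feature_id: str
    x_max: float
    x_min: float
    y_max: float
    y_min: float


def read_feature_rects(handle: TextIO, delimiter: str = "\t") -> Iterator[FeatureRect]:
    """Yield one FeatureRect per data row.

    Assumes the first six columns are: pair id, feature id,
    x max, x min, y max, y min (the order given in the comment above).
    """
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 6:
            continue  # skip blank or truncated lines
        try:
            coords = [float(value) for value in row[2:6]]
        except ValueError:
            continue  # skip a header line, if the file has one
        yield FeatureRect(row[0], row[1], *coords)


# Made-up example rows, only to show the call; these are not real CHESS output.
example = "7\t20\t30\t12\t18\t5\n7\t21\t30\t12\t40\t27\n"
rects = list(read_feature_rects(io.StringIO(example)))
for rect in rects:
    print(rect.pair_id, rect.feature_id, rect.x_min, rect.x_max, rect.y_min, rect.y_max)
```

With a real file, open it with open(path, newline="") and pass the handle instead of the io.StringIO object.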
I hope that helps a bit, good luck!
from chess.
@cgirardot , also thanks for pointing out that the information about the columns is missing in the docs, have added this.
from chess.
I am a bit confused by the formats. Let's take an example.
After cross-correlation, I have (in subregions_4_clusters_lost):
So I should look in "lost_features" for the feature IDs 20 to 23 :
And here is a view on this region :
we are looking at the blue boxes in the middle plot.
- Why do I have only 2 rectangles but 4 features in the lost_features file? My guess is that it is because the matrix is symmetrical?
- Are the coordinates in lost_features in "bins"? The region on display is chr3L:80000002-10000001; what is the easy way to extract the regions involved in e.g. feature 20? I am sure it is too late and I am missing the obvious here.
from chess.
Hi @sgalan
thanks for the confirmation. Could you also hint on the last question, which is the most important to move on for me. If you look at the data and images above: how can I extract the coordinates of regions involved in e.g. feature 20?
thx
from chess.
Hello @sgalan ,
I am a bit confused by the formula you provided. For example, with the loops found in @cgirardot's data, the x and y coordinates of the rectangle don't refer to the genomic regions that interact (x and y lie on the horizontal and vertical axes, while the interaction is between two regions found by following the diagonals).
How can we get the coordinates of the regions involved in the loop?
Best,
Perrine
from chess.
I realised that my previous comment may be unclear.
To be more precise, I have read the code regarding extracting the features and it seems to me that you scale (zoom clipped) and rotate the matrix before looking for the features. Is that correct?
If that is the case, it would explain why the x coordinates of symmetric features are the same (while x and y coordinates should be reversed between symmetric features in the original matrix), and also why the feature areas are rectangles and not diamond-shaped, as is usually the case in Hi-C matrices (especially for TADs).
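The geometric argument in the previous paragraph can be checked with a small sketch. This is plain coordinate geometry under the assumption of a 45-degree rotation about the origin; it is not CHESS code, the function name rotate_45 is hypothetical, and it says nothing about any zooming or clipping the tool may apply.

```python
import math
from typing import Tuple


def rotate_45(i: float, j: float) -> Tuple[float, float]:
    """Rotate the point (i, j) clockwise by 45 degrees about the origin."""
    c = math.cos(math.pi / 4)
    s = math.sin(math.pi / 4)
    return (c * i + s * j, -s * i + c * j)


# A matrix position (i, j) and its mirror image (j, i) across the diagonal.
upper = rotate_45(10, 25)
lower = rotate_45(25, 10)

# After rotation both share the same x, and their y values differ only in sign.
print(math.isclose(upper[0], lower[0]))   # same x coordinate
print(math.isclose(upper[1], -lower[1]))  # mirrored y coordinate
```

Under this assumption, x after rotation is proportional to i + j and y is proportional to j - i, which is why two mirrored features would end up with identical x ranges.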
If I understood the code correctly, it also means that the x and y coordinates retrieved from the gained and lost files do not correspond to the regions involved in the loop, but rather to the coordinates of the feature on the picture.
Is it correct?
Best,
Perrine
from chess.
@sgalan I was wondering if you could comment on @PerrineLacour questions?
from chess.
I marked this as 'good first issue'; if we find time for this we might implement this ourselves, but as Juanma said, pull requests with solutions are more than welcome.
from chess.