MAGIC on RNA + ATAC data #210

Open
yamajackr opened this issue Oct 28, 2022 · 5 comments

@yamajackr

yamajackr commented Oct 28, 2022

Hi @scottgigante
Thank you for the great tool. I would like to impute data from a 10x Genomics scMultiome dataset. Applying MAGIC to ATAC-seq data has been benchmarked here: https://doi.org/10.1093/bib/bbab442
I'm considering passing the distance matrix from Seurat's weighted nearest neighbor (WNN) analysis to MAGIC. Is that reasonable?

@scottgigante
Contributor

@YamamotoRyo this is a totally valid use case, yes. You can pass the distance matrix to a graphtools.Graph with precomputed='distance', and then pass this graph to MAGIC.fit(X, graph=graph).
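
A minimal sketch of the suggestion above, assuming `X` is a cells x features array and `D` is a cells x cells distance matrix (for example, one exported from Seurat). The helper names `check_precomputed_distance` and `fit_magic_on_distance` are hypothetical and not part of MAGIC or graphtools; only `graphtools.Graph(..., precomputed='distance')` and `MAGIC.fit(X, graph=graph)` come from the reply. The check simply catches shape problems before the graph is built.

```python
import numpy as np


def check_precomputed_distance(X, D):
    """Validate a cells x cells distance matrix D against cells x features data X."""
    X = np.asarray(X)
    D = np.asarray(D, dtype=float)
    if D.ndim != 2 or D.shape[0] != D.shape[1]:
        raise ValueError(f"D must be square, got shape {D.shape}")
    if D.shape[0] != X.shape[0]:
        raise ValueError(f"D covers {D.shape[0]} cells but X has {X.shape[0]} rows")
    if not np.all(np.isfinite(D)) or np.any(D < 0):
        raise ValueError("distances must be finite and non-negative")
    return D


def fit_magic_on_distance(X, D):
    """Hypothetical helper: fit MAGIC on a graph built from precomputed distances."""
    # Imported lazily so the validation helper can be used without these packages.
    import graphtools
    import magic

    D = check_precomputed_distance(X, D)
    graph = graphtools.Graph(D, precomputed="distance")
    return magic.MAGIC().fit(X, graph=graph)
```

The fitted operator is returned as-is; how `transform()` behaves with a precomputed graph is discussed further down the thread.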

@yamajackr
Author

@scottgigante Thank you so much!
I will try it!

@yamajackr
Author

yamajackr commented Nov 2, 2022

Hi @scottgigante,
I tried that. I built a graph from the affinity matrix and was able to fit a MAGIC operator with it, but the transform step failed.

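import pandas as pd
import graphtools
import magic

# Precomputed cells x cells affinity matrix
affi = pd.read_csv('affinity_mat.tsv', header=0, sep='\t', index_col=0)
data = affi.to_numpy()
graph = graphtools.Graph(data, precomputed='affinity')
magic_op_g = magic.MAGIC()
# X is the cells x genes expression matrix (the DataFrame called df in the output below)
magic_op_g = magic_op_g.fit(X=X, graph=graph)
X_magic = magic_op_g.transform()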

Error

magic_op_g = magic_op_g.fit(X=df, graph=graph)
Running MAGIC on 1729 cells and 21470 genes.
Using precomputed graph and diffusion operator...

X_magic = magic_op_g.transform()
Calculating imputation...
Calculated imputation in 0.26 seconds.
Traceback (most recent call last):

File "/var/folders/59/cxr2yt4926jc95n5w2mtz32r0000gn/T/ipykernel_34361/2648698257.py", line 1, in
X_magic = magic_op_g.transform()

File "/Users/jack/opt/anaconda3/envs/py310/lib/python3.10/site-packages/magic/magic.py", line 607, in transform
X_magic = utils.convert_to_same_format(

File "/Users/jack/opt/anaconda3/envs/py310/lib/python3.10/site-packages/magic/utils.py", line 167, in convert_to_same_format
data.columns = target_columns

File "/Users/jack/opt/anaconda3/envs/py310/lib/python3.10/site-packages/pandas/core/generic.py", line 5588, in __setattr__
return object.__setattr__(self, name, value)

File "pandas/_libs/properties.pyx", line 70, in pandas._libs.properties.AxisProperty.__set__

File "/Users/jack/opt/anaconda3/envs/py310/lib/python3.10/site-packages/pandas/core/generic.py", line 769, in _set_axis
self._mgr.set_axis(axis, labels)

File "/Users/jack/opt/anaconda3/envs/py310/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 214, in set_axis
self._validate_set_axis(axis, new_labels)

File "/Users/jack/opt/anaconda3/envs/py310/lib/python3.10/site-packages/pandas/core/internals/base.py", line 69, in _validate_set_axis
raise ValueError(

ValueError: Length mismatch: Expected axis has 1729 elements, new values have 21470 elements

The default call, without a precomputed graph, worked.

magic_op = magic.MAGIC()
X_magic = magic_op.fit_transform(X)

Any help would be appreciated.

Thank you,
Ryosuke
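
The length mismatch in the traceback above is the error pandas raises when a frame with one column per cell is given one label per gene. The toy example below reproduces only that pandas behaviour, with small sizes standing in for 1729 cells and 21470 genes; it is an illustration and does not show why MAGIC produced an array of that shape.

```python
import numpy as np
import pandas as pd

n_cells, n_genes = 4, 6  # toy stand-ins for 1729 cells and 21470 genes
gene_names = [f"gene{i}" for i in range(n_genes)]

# A cells x cells result: the wrong shape for an imputed cells x genes matrix
wrong_shape = pd.DataFrame(np.zeros((n_cells, n_cells)))

try:
    # Same kind of assignment as `data.columns = target_columns` in the traceback
    wrong_shape.columns = gene_names
    message = None
except ValueError as err:
    message = str(err)

print(message)
```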

@scottgigante
Contributor

Looks like a bug, but you can work around it with magic_op_g = magic_op_g.fit(X=df.to_numpy(), graph=graph)

@yamajackr
Author

yamajackr commented Nov 2, 2022

Thanks, @scottgigante.
Your code ran, but the returned array had the same shape as the graph rather than the data.
(The data had about 21,000 genes and 1,729 cells; the returned array was 1,729 x 1,729.)

Instead, I tried this.

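magic_op_g = magic.MAGIC()
magic_op_g = magic_op_g.fit(X=df, graph=graph)
# Raise the cells x cells diffusion operator to the power t = 3
diff_op_3 = np.linalg.matrix_power(magic_op_g.diff_op, 3)
# Apply it to the cells x genes data; the result has the same shape as df
data_new = np.array(np.dot(diff_op_3, df))
# Keep the cell labels as well as the gene labels
df_new = pd.DataFrame(data=data_new, index=df.index, columns=df.columns.tolist())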

I think it works.
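
The workaround above can be checked on toy data with NumPy alone. The sketch below assumes a dense, row-normalized diffusion operator; `diffuse` is a hypothetical helper name, and the random affinities only stand in for a real precomputed matrix. It is not MAGIC's own implementation, just the same product with a shape check added.

```python
import numpy as np


def diffuse(diff_op, X, t=3):
    """Return (diff_op ** t) @ X for a dense cells x cells operator."""
    P = np.asarray(diff_op, dtype=float)
    X = np.asarray(X, dtype=float)
    if P.shape != (X.shape[0], X.shape[0]):
        raise ValueError(
            f"operator shape {P.shape} does not match {X.shape[0]} cells in X"
        )
    return np.linalg.matrix_power(P, t) @ X


rng = np.random.default_rng(0)
affinity = rng.random((5, 5))
affinity = affinity + affinity.T  # toy symmetric affinities, 5 cells
P = affinity / affinity.sum(axis=1, keepdims=True)  # each row sums to 1
X = rng.random((5, 8))  # 5 cells x 8 genes

X_smooth = diffuse(P, X, t=3)
print(X_smooth.shape)  # same shape as X, not cells x cells
```

Because each row of the operator sums to 1, every smoothed value is a weighted average of that gene across cells, so a gene that is constant across cells stays constant.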
