Implemented classes

Autoencoder

Matrix

class src.matrix.Hic(cooler, *args, **kwargs)
This class inherits from the Matrix class and sets the matrix numpy array for Hi-C data.
cooler

Storage of the Hi-C matrix

Type: cooler
calculate_cum_length()

Calculates and returns the cumulative length from chromosome 1 to N.

Returns: Information on the chromosomes, their lengths and their cumulative lengths
Return type: Pandas DataFrame
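
A cumulative length table of this kind can be sketched as follows. This is only an illustration: the function name `cumulative_lengths`, the column names and the dictionary input are assumptions, and the actual method presumably reads the chromosome sizes from the cooler object. Whether the real cumulative length includes the current chromosome is also not stated here; the sketch includes it.

```python
import pandas as pd


def cumulative_lengths(chrom_sizes):
    """Return a DataFrame with each chromosome's length and cumulative length.

    chrom_sizes is a hypothetical dict mapping chromosome name to length.
    """
    df = pd.DataFrame(
        {"chrom": list(chrom_sizes), "length": list(chrom_sizes.values())}
    )
    # Running total from the first chromosome up to and including the current one
    df["cum_length"] = df["length"].cumsum()
    return df


# Toy values, not real genome sizes
sizes = {"chr1": 1000, "chr2": 800, "chr3": 500}
cum_df = cumulative_lengths(sizes)
print(cum_df)
```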
set_matrix()

Sets the Hi-C numpy array for the chromosome chrom_num. The matrix is transformed into an upper triangular matrix, and its values are converted to float32, rescaled by log10 and normalized.
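
The transformation described above could look roughly like the sketch below. The function name `preprocess` is hypothetical, and two details are assumptions that the description does not confirm: adding 1 before taking log10 (so that zero counts stay at zero) and normalizing by dividing by the maximum value.

```python
import numpy as np


def preprocess(matrix):
    """Upper-triangularize, cast to float32, log10-rescale and normalize."""
    # Keep the diagonal and everything above it, zero out the lower triangle
    upper = np.triu(matrix).astype(np.float32)
    # Assumption: log10(x + 1) so that empty bins map to 0 instead of -inf
    scaled = np.log10(upper + 1)
    # Assumption: normalize to [0, 1] by dividing by the maximum
    max_val = scaled.max()
    if max_val > 0:
        scaled = scaled / max_val
    return scaled


# Toy symmetric contact matrix
raw = np.array([[9, 99, 0], [99, 9, 999], [0, 999, 9]])
processed = preprocess(raw)
print(processed)
```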

class src.matrix.HistoneMark(bed_file, *args, **kwargs)
This class inherits from the Matrix class and sets the matrix numpy array for a histone mark.
mark_df

Histone modification sparse matrix

Type: Pandas DataFrame
set_matrix()

Sets the histone modification numpy array for the chromosome chrom_num. The values of the matrix are converted to float32, rescaled by log10 and normalized.

class src.matrix.Matrix(resolution, chrom_num, side)
This class stores a matrix and its related numpy arrays, and plots and writes this matrix.
resolution

Resolution (or bin size) of the matrix

Type: int
chrom_num

Chromosome chosen for processing

Type: int
side

Square side of a numpy array sub-matrix

Type: int
matrix

Matrix stored in a numpy array

Type: numpy array
sub_matrices

The matrix is divided into S sub-matrices of size side*side and stored in a numpy array of shape (X, side, side, 1)

Type: numpy array
white_sub_matrices_ind

Position of the blank sub-matrices

Type: list
total_sub_matrices

Total number of sub-matrices

Type: int
latent_spaces

Latent spaces (encoded sub-matrices) stored in a numpy array

Type: numpy array
predicted_sub_matrices

Predicted sub_matrices (decoded latent spaces) stored in a numpy array

Type: numpy array
plot_distribution_matrix(matrix_type, path)

Plot the distribution of the matrix.

Parameters:
  • matrix_type (str) – Matrix’s name
  • path (str) – Path of the output plot
plot_matrix(matrix_type, color_map, path)

The matrix is plotted in a file.

Parameters:
  • matrix_type (str) – Matrix’s name
  • color_map (matplotlib.colors.ListedColormap) – Color map
  • path (str) – Path of the output plot
plot_sub_matrices(matrix_type, index_list, color_map, path)

40 random sub-matrices are plotted in a file.

Parameters:
  • matrix_type (str) – Matrix’s name
  • index_list (list) – List of the 40 sub-matrix indexes to plot
  • color_map (matplotlib.colors.ListedColormap) – Color map
  • path (str) – Path of the output plot
set_predicted_latent_spaces(latent_spaces)

Set the latent spaces predicted by the encoder.

Parameters: latent_spaces (numpy array) – The predicted latent spaces
set_predicted_sub_matrices(predicted_sub_matrices)

Set the sub-matrices predicted by the whole autoencoder.

Parameters: predicted_sub_matrices (numpy array) – The predicted sub-matrices
set_sub_matrices()

Divides the matrix into S sub-matrices of size side*side. The empty sub-matrices (sum(values)==0) are removed from the data set. The resulting sub-matrices are stored in a numpy array of shape (X, side, side, 1).
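
The splitting step can be illustrated with the sketch below. It assumes non-overlapping tiles taken row by row over the whole matrix, which the description does not confirm; the function name `split_into_sub_matrices` and its return values are hypothetical, chosen to mirror the sub_matrices, white_sub_matrices_ind and total_sub_matrices attributes.

```python
import numpy as np


def split_into_sub_matrices(matrix, side):
    """Tile a 2D matrix into side*side blocks and drop the all-zero ones."""
    n_rows, n_cols = matrix.shape
    kept = []
    blank_indices = []
    index = 0
    # Assumption: non-overlapping tiles, scanned row by row
    for i in range(0, n_rows - side + 1, side):
        for j in range(0, n_cols - side + 1, side):
            tile = matrix[i:i + side, j:j + side]
            if tile.sum() == 0:
                blank_indices.append(index)  # remember where the blank tile was
            else:
                kept.append(tile)
            index += 1
    # Add a trailing channel axis: (number of kept tiles, side, side, 1)
    stacked = np.array(kept, dtype=np.float32).reshape(-1, side, side, 1)
    return stacked, blank_indices, index


# Toy 4x4 matrix whose bottom-left 2x2 tile is empty
toy = np.zeros((4, 4))
toy[0:2, 0:2] = 1
toy[0:2, 2:4] = 3
toy[2:4, 2:4] = 2
sub_matrices, blank_ind, total = split_into_sub_matrices(toy, side=2)
print(sub_matrices.shape, blank_ind, total)
```

Keeping the indices of the blank tiles is what makes it possible to put the remaining tiles back in their original positions later.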

write_sparse_matrix(matrix_type, path)

The reconstructed and predicted Hi-C matrix is saved in a sparse matrix file.

Parameters:
  • matrix_type (str) – Matrix’s name
  • path (str) – Path of the output

Interpolation

class src.interpolation.Interpolation(alphas)
This class groups the attributes and functions used to construct two or more interpolated matrices, write them to sparse matrix files and plot them.
alphas

List of float values to use for the interpolation (alpha parameter)

Type: list
interpolated_submatrices

List of all the interpolated sub-matrices. Each item in the list contains an interpolation with a different alpha.

Type: list
integrated_matrix

List of all the integrated (interpolated) reconstructed matrices. Each item in the list contains an interpolation with a different alpha.

Type: list
construct_integrated_matrix(hic)

Construction of the whole integrated matrices from the interpolated sub-matrices.

Parameters: hic (Hic(Matrix) object) – Hi-C matrix
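
As a rough illustration of rebuilding a full matrix from sub-matrices, the sketch below places each tile back at its original position and leaves the blank tiles at zero. The function name `rebuild_matrix`, the `tiles_per_row` parameter and the row-by-row tile order are all assumptions; the actual method takes this information from the hic object.

```python
import numpy as np


def rebuild_matrix(sub_matrices, blank_indices, total, side, tiles_per_row):
    """Reassemble a 2D matrix from non-blank tiles of shape (side, side, 1)."""
    blank = set(blank_indices)
    n_tile_rows = total // tiles_per_row
    full = np.zeros((n_tile_rows * side, tiles_per_row * side), dtype=np.float32)
    kept = iter(sub_matrices)
    for index in range(total):
        if index in blank:
            continue  # blank tiles were removed earlier and stay at zero
        tile = next(kept)
        # Assumption: tiles were numbered row by row
        row, col = divmod(index, tiles_per_row)
        full[row * side:(row + 1) * side, col * side:(col + 1) * side] = tile[:, :, 0]
    return full


# Three non-blank 2x2 tiles; the tile at index 2 was blank
tiles = np.stack([np.full((2, 2, 1), v, dtype=np.float32) for v in (1, 3, 2)])
rebuilt = rebuild_matrix(tiles, blank_indices=[2], total=4, side=2, tiles_per_row=2)
print(rebuilt)
```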
plot_integrated_matrix(hic, color_map, path)

The integrated matrices are plotted for each alpha value.

Parameters:
  • hic (Hic(Matrix) object) – Hi-C matrix
  • color_map (matplotlib.colors.ListedColormap) – Color map
  • path (str) – Path of the output plot
plot_interpolated_submatrices(hic, index_list, color_map, path)

40 random integrated sub-matrices are plotted for each alpha value.

Parameters:
  • hic (Hic(Matrix) object) – Hi-C matrix
  • index_list (list) – List of the 40 sub-matrix indexes to plot
  • color_map (matplotlib.colors.ListedColormap) – Color map
  • path (str) – Path of the output plot
write_predicted_sparse_matrix(hic, path, threshold=0.0001)

The integrated matrices are saved in sparse matrix files for each alpha value.

Parameters:
  • hic (Hic(Matrix) object) – Hi-C matrix
  • path (str) – Path of the output
  • threshold (float) – The values under the threshold will be set to 0
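
The thresholding and sparse output can be sketched as below. The output layout (one tab-separated row, column, value line per non-zero entry) is an assumption, since the file format is not described here, and the helper names `to_sparse_rows` and `write_sparse` are hypothetical.

```python
import numpy as np


def to_sparse_rows(matrix, threshold=0.0001):
    """Zero out values under the threshold and list the remaining entries."""
    filtered = np.where(matrix < threshold, 0.0, matrix)
    rows, cols = np.nonzero(filtered)
    return [(int(r), int(c), float(filtered[r, c])) for r, c in zip(rows, cols)]


def write_sparse(matrix, path, threshold=0.0001):
    """Write the non-zero entries as tab-separated lines (assumed format)."""
    with open(path, "w") as handle:
        for r, c, value in to_sparse_rows(matrix, threshold):
            handle.write(f"{r}\t{c}\t{value}\n")


# The value 0.00005 is under the default threshold and is dropped
dense = np.array([[0.5, 0.00005], [0.0, 0.2]])
entries = to_sparse_rows(dense)
print(entries)
```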
class src.interpolation.InterpolationInLatentSpace(*args, **kwargs)
This class inherits from the Interpolation class and interpolates sub-matrices in the latent space.
interpolate_latent_spaces(hist_marks, hic_latent_spaces)

Double linear interpolation of the latent spaces of the Hi-C and histone marks.

Parameters:
  • hist_marks (dict) – Dictionary containing all the HistoneMark objects
  • hic_latent_spaces (numpy array) – Latent spaces of the Hi-C
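
For reference, a single linear interpolation between two arrays over a list of alpha values can be sketched as follows. How the method combines several histone marks (the "double" interpolation) is not described here, so this shows only the basic building block. The function name `linear_interpolation` and the choice of which array is weighted by alpha are assumptions.

```python
import numpy as np


def linear_interpolation(first, second, alphas):
    """Return one interpolated array per alpha: alpha*first + (1-alpha)*second."""
    return [alpha * first + (1.0 - alpha) * second for alpha in alphas]


# Toy arrays standing in for a Hi-C and a histone mark latent space
hic_latent = np.array([[0.0, 2.0], [4.0, 6.0]])
mark_latent = np.array([[2.0, 2.0], [0.0, 2.0]])
interpolated = linear_interpolation(hic_latent, mark_latent, [0.0, 0.5, 1.0])
for alpha, result in zip([0.0, 0.5, 1.0], interpolated):
    print(alpha, result.tolist())
```

With alpha = 0 the result equals the second array, and with alpha = 1 it equals the first. The same formula applies to sub-matrices in the pixel space.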
set_decoded_latent_spaces(decoder, side)

The interpolated latent spaces are decoded.

Parameters:
  • decoder (keras model object) – Decoder model
  • side (int) – Square side
class src.interpolation.NormalInterpolation(*args, **kwargs)
This class inherits from the Interpolation class and interpolates sub-matrices in the pixel space (i.e. without using the encoder and decoder).
alphas

List of float values to use for the interpolation (alpha parameter)

Type: list
interpolated_submatrices

List of all the interpolated sub-matrices. Each item in the list contains an interpolation with a different alpha.

Type: list
integrated_matrix

List of all the integrated (interpolated) reconstructed matrices. Each item in the list contains an interpolation with a different alpha.

Type: list
interpolate_predicted_img(hist_marks, predicted_hic)

Double linear interpolation of the predicted sub-matrices of the Hi-C and histone marks.

Parameters:
  • hist_marks (dict) – Dictionary containing all the HistoneMark objects
  • predicted_hic (numpy array) – Predicted sub-matrices of the Hi-C