Changelog¶
Changes in Version 2.0¶
Version 2.0 brings a cleaned-up API to forgi. The BulgeGraph object should now be treated as immuteable on the level of defines and edges. Furthermore, the new Sequence class deals with many quirks of the PDB format, including tuple-indices and missing and modified residues. We now fully support the MMCIF fileformat.
Some of the biggest changes:
All RNA file-formats can be loaded with the single function `forgi.load_rna`, which returns a list of connected components.
The old functions for creating the RNA from a specific file type are now classmethods of the BulgeGraph class.
BulgeGraph.__init__should no longer be used.The API of the Bulgegraph object was cleaned up, including the following changes:
- Functions modifying the defines and edges of the BulgeGraph (like
add_node) have been moved to the private module forgi.graph._graph_construction, as they are only used during creation of the BulgeGraph object, includingdissolve_length_one_stems. Instead,length_one_stem_basepairswas added. - To get a modified copy of a BulgeGraph, use the member methods of the GraphTransformer, accessed with
.transformed. These methods are implemented in forgi/graph/transform_graphs.py and more methods will be added in the future. - Some functions that are probably not useful for client code have been made private or removed, including:
all_connections->_all_connectionscompare_stems,compare_*: removedfind_external_loops, useget_domainsflanking_nucleotides, usedefine_a.get_any_sides,get_sides_plusremoved- Functions only called by
to_bg_stringare now private (get_*_string, e.g._get_connect_str). get_stem_directionremovedget_vertexremovednd_define_iteratorremovedsubseq_with_cutpointsremoved, use Sequence class’s__getitem__
get_node_from_residue_numis deprecated and shuld be replaced byget_elem- Better support for
forgi.graph.residue.RESIDas indices. - New properties/ methods
junctions,rods,pseudoknotted_basepairs,seq_length
- Functions modifying the defines and edges of the BulgeGraph (like
A new sequence class, which allows for indexing with integers (1-based, like the defines) and
forgi.graph.residue.RESIDtuples. Conversion between these two indices is now also delegated to this class.__getitem__is implemented to properly allow for slices with cutpoints.Some changes were made to the CoarseGrainRNA class:
- Virtual residues are now stored for loops (at the C1’ atom), if the RNA was loaded from a PDB or MMCIF file.
- Experimental functions are now made private.
- New function
get_incomplete_elementsfor elements with missing residues.- New function
rotate_translate
- PDB files can now be read, even if neither DSSR nor MC-Annotate are installed. However, the accurracy of the secondary structure may suffer in this case.
forgi.threedee.model.similarity.cg_rmsdnow takes virtual residues of loops into account.- The module
forgi.graph.numbered_dotbracketallows you to modify the dotbracket structure while keeping track of the mapping between brackets and residues.
Changes in Version 1.1¶
- We now support DSSR as an alternative to MC-Annotate. See the Installation for more information
- RESID for pdb-derived residue ids in now moved into a seperate module.
describe_cg.pynow adds the number of the multiloop in the sequence to the generated csv.- Bugfixes:
- python3 compatibility of
visualize_cg.py - windows-compatibility of
visualize_cg.py - fix count of multiloop-types in
describe_cg.py - persist cofold-breakpoints in cg-files, even if no sequence was given.
- fix bugs in parsing of MC-Annotate output. Forgi should now be able to load most pdb-files.
- properly propagate the
--keep-length-one-stemsoption.
- python3 compatibility of
Changes in Version 1.0¶
With version 1.0, we introduce support for cofold and multifold structures. Further more, we are finally compatible with Python 3 and Python 2
In the following list of the biggest changes, the most important changes are highlighted in bold.
Python 3 compatibility: forgi is now compatible with python 2 and 3. In paticular, we have automatic tests for python 3.5, 3.6 and 2.7
Logging support. We now use the
loggingmodule instead of print statements.In
forgi.graph.bulge_graph:- Cofold Structures A BulgeGraph can now hold multiple RNA chains,
if they are all connected by basepairs.
bg.seq_ids now holds named tuples (chain, resid)
New Errors (
GraphConstructionErrorandGraphIntegrityError) as specializations of ValueError are introduced and sometimes raised by BulgeGraph instances.Fasta sequences containing ‘T’ characters are now allowed (and converted to ‘U’)
Sequence class: Introduced
Sequenceclass, a string subclass with 1-based indexing and support for ‘&’ (cofold structures)define_range_iteratorno longer accepts theseq_idskeyword.The Bulge-graph now has a new method
define_a. It returns the define with adjacent nucleotides (if present) and also works for 0-nucleotide multiloops.Consistent element numbering:
- During graph creation, multiloops are now numbered from m0, m1, … according to their position along the backbone.
- The numbering of elements during bulge-graph construction is now consistent independent of the order of dictionaries (which was randomized in python 3 and is ordered in python 3.6)
- Looking at multiloops as a whole:
Multiloops continue to consist of independent multiloop segments. However, we additionally introduced methods to look at multiloops as a whole.
find_mlonly_multiloopsfinds out, which multiloop segment belong- together in a junction
describe_multiloop: reports whether the junctions found with the- previous method belong to a pseudoknot, a multiloop or an exterior loop.
get_angle_typenow supports theallow_brokenkeyword to return an angle type instread of None in case of Multiloop segments that are not part of the Minimum Spanning Treebg.iter_elements_along_backboneintroduced.
forgi.threedee.utilities.average_atom_positionswas converted to a JSONdata file, which is read by
forgi.threedee.utilities.graph_pdb. This saves time upon import. In the future, average_atom_positions might be entirely removed.
New module
forgi.threedee.model.linecloudto store the coordinate and twist vectors. This module’s classes implement a Mapping interface but internally use a numpy array, which allows for speedup of many operations.In
forgi.graph.bulge_graph:- Exceptions
CgConstructionErrorCgConstructionErrorandCgIntegrityErroras subclasses of the new bulge graph exceptions - Load all chains from a PDB file: The module level function
connected_cgs_from_pdbloads all RNA chains from a PDB file and greates a cg object for each connected component. - Bugfix: Supplying the secondary structure during loading of PDBs now works again.
- Convenience function
cg.get_statsas a wrapper aroundget_bulge_abgle_stats,get_loop_statsandget_stem_statsrespectively. - Code for steric value and stacking detection is EXPERIMENTAL!
- Exceptions
Modules
ftm.ensemble,ftm.ensemble2andftu.dssrare EXPERIMENTALFaster RMSD by using QCOrot algorithm with Cython instead of Kabsch algorithm
AngleStat class now supports unary minus operator.
Recalculated
average_stem_vres_positionsfrom the NR-list.- In
forgi.threedee.utilities.graph_pdb: virtual stats and sum of angle stats to describe orientation of non-adjacent stems in multiloops.
- In
Modified Residues: Better treatement of modified residues. We try to query PDBeChem to replace the modified residue with the unmodified parent.
Changes in Version 0.4¶
In
forgi.graph.bulge_graph:- Speed improvement: Basepair distances between elements are cached.
- The Bulge-graph object and file format supports arbitrary key-value pairs in the
infodictionary. BulgeGraph.get_connected_nucleotidesno longer sorts the output nucleotides. Now this function depends on the order of stem1 and stem2 and can thus be used to determine the direction of a bulge. This is used in the new member functionget_link_direction.- Added function
BulgeGraph.get_domains, to return a lists of pseudoknots, multiloops and rods. The interface of this function might change in the future. - Merged pull-request by tcarlile for
forgi.graph.bulge_graph:BulgeGraph.get_stem_edge(self, stem, pos): Returns 0 if pos on 5’ edge of stem, returns 1 if pos on 3’ edge of stem.BulgeGraph.shortest_path(self, e1, e2): Returns a list of the nodes along the shortest path between e1, and e2.
Restructured forgi.threedee.model.comparison and forgi.threedee.utilities.rmsd into
forgi.threedee.model.similarityandforgi.threedee.model.descriptorsThesimilaritymodule contains all functions for the comparison of two point clouds or two cg structures. Thedescriptorsmodule contains functions for describing a single point cloud, such as the radius of gyration or new functions for the gyration tensor.average_stem_vres_positionsare back with recalculated valuesChanges in
forgi.threedee.model.coarse_grainto theCoarseGrainRNAobject:In the
self.coordsdictionary, the start and end coordinate are now in a consistent order.Call new member function
self.add_all_virtual_residuesinstead offorgi.threedee.utilities.graph_pdb.add_virtual_residuesCoordinates of interior loops and multiloop segmentsa are no longer stored in the cg-files, as they can be deduced from stem coordinates.
- New member function
self.add_bulge_coords_from_stemsis provided instead of functionforgi.threedee.utilities.graph_pdb.add_bulge_information_from_pdb_chain
- New member function
Member function
self.get_virtual_residue(pos)is provided for easier access than directly viaself.v3dposs. For single stranded regions, virtual residue positions along the direct line of the coarse grain element can be estimated optionally. Virtual residues are cached and the cache is automatically cleared when the coordinates or twists of the coarse grained RNA are changed.Functions for creating coordinate arrays for the structure
self.get_ordered_stem_possfor the start and end coordinates of stems.self.get_ordered_virtual_residue_possfor all virtual residue coordinates of stems. Replacesforgi.threedee.utilities.graph_pdb.bg_virtual_residuesself.get_poss_for_domainfor coordinates only for certain coarse grained elements.
Removed the addition of a pseudo-vector to loop stats in
get_loop_stats, which was used to avoid zero-length bulges. Now 0-length bulges are possible. This makes saving and loading stats consistent.self.get_coordinates_arraynow returns a 2Dnx3 numpy arrayholding n coordinate entries. You can use numpy’s.flatten()to generate a 1D array. If you want to load a 1D flat coordinate arraya, useself.load_coordinates_array(a.reshape((-1,3))Overrides the newly introduced method
sorted_edges_for_mstfrom BulgeGraph. Now elements that have nosampledentry are broken preferedly. This should ensure that the minimal spanning tree is the same after saving and loading an RNA generated with the program Ernwin to/from a file.self.coords_to_directionsandcoords_from_directions: Export the coordinates as an array of directions (end-start). This array has only 1 entry per coarse grained element.
In
forgi.threedee.model.stats: Added class for clustered angle stats.Changes in
forgi.projection.hausdorff.Changes in
forgi.projection.projection2d- Faster rotation and rasterization.
- selected virtual residues can be included in the projection
In
forgi.threedee.utilities.graph_pdb: Added functionsget_basepair_centerandget_basepair_plane. This will be used in the future for stacking detection.
Changes in Version 0.3¶
- CoarseGrainRNA now has a member
cg.virtual_atoms()which is used for caching of virtual atom positions.forgi.threedee.utilities.graph_pdb.virtual_atoms()now only calculates atom positions when they are needed. The two changes together lead to a great speed improvement. - The subpackage
forgi.visualwas started for easy visualization of RNA using fornac or matplotlib. This subpackage is in an early development stage and will be improved in future versions. - The subpackage forgi.projection was added to work with projections of CoarseGrainedRNA objects onto a 2D plane.
- Now
forgi.threedee.model.average_atom_positionsis used for all virtual atom calculations whileaverage_stem_vres_positionshas been removed. This leads to more consistent virtual atom calculations. Further more the values inaverage_atom_positionshave been recalculated. - In
forgi.threedee.model.rmsd, the functionscentered_rmsdandcentered_drmsdhave been deprecated and deleted respectively. Usermsd(cg1,cg2)for a centered RMSD. This removes code duplication. - In
forgi.threedee.model.comparisona ConfusionMatrix class was introduced for speedup with repeated comparisons to the same reference. - Several smaller changes and improvements